{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {
    "colab": {
      "name": "pca_homework_basic.ipynb",
      "provenance": [],
      "collapsed_sections": [],
      "include_colab_link": true
    },
    "kernelspec": {
      "name": "python3",
      "display_name": "Python 3"
    }
  },
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "view-in-github",
        "colab_type": "text"
      },
      "source": [
        "<a href=\"https://colab.research.google.com/github/henrygas/unsupervised_learning/blob/master/pca_homework_basic.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "AJ9o4q3OOszN",
        "colab_type": "text"
      },
      "source": [
        "## 1. Mount Google Drive"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "wo8v0dDAOont",
        "colab_type": "code",
        "outputId": "4f34f525-b8cb-4515-9dfa-2cc97fbb6b61",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 127
        }
      },
      "source": [
        "# Mount Google Drive so the notebook can read the dataset stored there.\n",
        "from google.colab import drive\n",
        "drive.mount(\"/content/drive\")"
      ],
      "execution_count": 2,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "Mounted at /content/drive\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "gsbDPRCFPE83",
        "colab_type": "text"
      },
      "source": [
        "## 2. Change the working directory"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "UQD5qHHhPH-y",
        "colab_type": "code",
        "colab": {}
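      },
      "source": [
        "import os\n",
        "\n",
        "# Use an absolute path so the cell can be re-run safely after the working directory has changed.\n",
        "os.chdir(\"/content/drive/My Drive/app/pca_homework\")"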
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "g1bd9n4zPSjz",
        "colab_type": "text"
      },
      "source": [
        "## 3. Check the current working directory"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "7amsTwGTPWAw",
        "colab_type": "code",
        "outputId": "ec11c165-5376-40dd-a3ff-032f0d6ae7fc",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        }
      },
      "source": [
        "!ls"
      ],
      "execution_count": 4,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "data  model\n"
          ],
          "name": "stdout"
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "n5lQOehjPamj",
        "colab_type": "text"
      },
      "source": [
        "## 4. Sample 10,000 records from the Otto product dataset"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "XGywtFGLP4Sl",
        "colab_type": "code",
        "outputId": "f6d7ceb5-5c98-46f1-b474-36820a0cd61b",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 253
        }
      },
      "source": [
        "import pandas as pd\n",
        "from sklearn.model_selection import train_test_split\n",
        "\n",
        "# Load the full Otto training set (path is relative to the working directory set above).\n",
        "data_path = \"./data/Otto_train.csv\"\n",
        "data = pd.read_csv(data_path)\n",
        "data.head()"
      ],
      "execution_count": 13,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>feat_1</th>\n",
              "      <th>feat_2</th>\n",
              "      <th>feat_3</th>\n",
              "      <th>feat_4</th>\n",
              "      <th>feat_5</th>\n",
              "      <th>feat_6</th>\n",
              "      <th>feat_7</th>\n",
              "      <th>feat_8</th>\n",
              "      <th>feat_9</th>\n",
              "      <th>feat_10</th>\n",
              "      <th>feat_11</th>\n",
              "      <th>feat_12</th>\n",
              "      <th>feat_13</th>\n",
              "      <th>feat_14</th>\n",
              "      <th>feat_15</th>\n",
              "      <th>feat_16</th>\n",
              "      <th>feat_17</th>\n",
              "      <th>feat_18</th>\n",
              "      <th>feat_19</th>\n",
              "      <th>feat_20</th>\n",
              "      <th>feat_21</th>\n",
              "      <th>feat_22</th>\n",
              "      <th>feat_23</th>\n",
              "      <th>feat_24</th>\n",
              "      <th>feat_25</th>\n",
              "      <th>feat_26</th>\n",
              "      <th>feat_27</th>\n",
              "      <th>feat_28</th>\n",
              "      <th>feat_29</th>\n",
              "      <th>feat_30</th>\n",
              "      <th>feat_31</th>\n",
              "      <th>feat_32</th>\n",
              "      <th>feat_33</th>\n",
              "      <th>feat_34</th>\n",
              "      <th>feat_35</th>\n",
              "      <th>feat_36</th>\n",
              "      <th>feat_37</th>\n",
              "      <th>feat_38</th>\n",
              "      <th>feat_39</th>\n",
              "      <th>...</th>\n",
              "      <th>feat_55</th>\n",
              "      <th>feat_56</th>\n",
              "      <th>feat_57</th>\n",
              "      <th>feat_58</th>\n",
              "      <th>feat_59</th>\n",
              "      <th>feat_60</th>\n",
              "      <th>feat_61</th>\n",
              "      <th>feat_62</th>\n",
              "      <th>feat_63</th>\n",
              "      <th>feat_64</th>\n",
              "      <th>feat_65</th>\n",
              "      <th>feat_66</th>\n",
              "      <th>feat_67</th>\n",
              "      <th>feat_68</th>\n",
              "      <th>feat_69</th>\n",
              "      <th>feat_70</th>\n",
              "      <th>feat_71</th>\n",
              "      <th>feat_72</th>\n",
              "      <th>feat_73</th>\n",
              "      <th>feat_74</th>\n",
              "      <th>feat_75</th>\n",
              "      <th>feat_76</th>\n",
              "      <th>feat_77</th>\n",
              "      <th>feat_78</th>\n",
              "      <th>feat_79</th>\n",
              "      <th>feat_80</th>\n",
              "      <th>feat_81</th>\n",
              "      <th>feat_82</th>\n",
              "      <th>feat_83</th>\n",
              "      <th>feat_84</th>\n",
              "      <th>feat_85</th>\n",
              "      <th>feat_86</th>\n",
              "      <th>feat_87</th>\n",
              "      <th>feat_88</th>\n",
              "      <th>feat_89</th>\n",
              "      <th>feat_90</th>\n",
              "      <th>feat_91</th>\n",
              "      <th>feat_92</th>\n",
              "      <th>feat_93</th>\n",
              "      <th>target</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>4</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>11</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>7</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>6</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>4</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>6</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>7</td>\n",
              "      <td>2</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>58</td>\n",
              "      <td>0</td>\n",
              "      <td>10</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>22</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_1</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>4</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_1</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 95 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "   id  feat_1  feat_2  feat_3  ...  feat_91  feat_92  feat_93   target\n",
              "0   1       1       0       0  ...        0        0        0  Class_1\n",
              "1   2       0       0       0  ...        0        0        0  Class_1\n",
              "2   3       0       0       0  ...        0        0        0  Class_1\n",
              "3   4       1       0       0  ...        0        0        0  Class_1\n",
              "4   5       0       0       0  ...        0        0        0  Class_1\n",
              "\n",
              "[5 rows x 95 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 13
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "44FwD_nKQSke",
        "colab_type": "code",
        "outputId": "60611443-a954-4704-ea73-c1b1237fbec8",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 35
        }
      },
      "source": [
        "# Draw a reproducible random sample of 10,000 rows; data_del holds the remaining rows.\n",
        "data_use, data_del = train_test_split(data, train_size=10000, random_state=0)\n",
        "data_use.shape"
      ],
      "execution_count": 14,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/plain": [
              "(10000, 95)"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 14
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "A1rjSVBce9y2",
        "colab_type": "code",
        "outputId": "df28d3ae-bcbb-4648-a1b7-c098e3d36abf",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 253
        }
      },
      "source": [
        "# Replace the shuffled original row labels with a fresh 0..n-1 index.\n",
        "data_use = data_use.reset_index(drop=True)\n",
        "data_use.head()"
      ],
      "execution_count": 15,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>feat_1</th>\n",
              "      <th>feat_2</th>\n",
              "      <th>feat_3</th>\n",
              "      <th>feat_4</th>\n",
              "      <th>feat_5</th>\n",
              "      <th>feat_6</th>\n",
              "      <th>feat_7</th>\n",
              "      <th>feat_8</th>\n",
              "      <th>feat_9</th>\n",
              "      <th>feat_10</th>\n",
              "      <th>feat_11</th>\n",
              "      <th>feat_12</th>\n",
              "      <th>feat_13</th>\n",
              "      <th>feat_14</th>\n",
              "      <th>feat_15</th>\n",
              "      <th>feat_16</th>\n",
              "      <th>feat_17</th>\n",
              "      <th>feat_18</th>\n",
              "      <th>feat_19</th>\n",
              "      <th>feat_20</th>\n",
              "      <th>feat_21</th>\n",
              "      <th>feat_22</th>\n",
              "      <th>feat_23</th>\n",
              "      <th>feat_24</th>\n",
              "      <th>feat_25</th>\n",
              "      <th>feat_26</th>\n",
              "      <th>feat_27</th>\n",
              "      <th>feat_28</th>\n",
              "      <th>feat_29</th>\n",
              "      <th>feat_30</th>\n",
              "      <th>feat_31</th>\n",
              "      <th>feat_32</th>\n",
              "      <th>feat_33</th>\n",
              "      <th>feat_34</th>\n",
              "      <th>feat_35</th>\n",
              "      <th>feat_36</th>\n",
              "      <th>feat_37</th>\n",
              "      <th>feat_38</th>\n",
              "      <th>feat_39</th>\n",
              "      <th>...</th>\n",
              "      <th>feat_55</th>\n",
              "      <th>feat_56</th>\n",
              "      <th>feat_57</th>\n",
              "      <th>feat_58</th>\n",
              "      <th>feat_59</th>\n",
              "      <th>feat_60</th>\n",
              "      <th>feat_61</th>\n",
              "      <th>feat_62</th>\n",
              "      <th>feat_63</th>\n",
              "      <th>feat_64</th>\n",
              "      <th>feat_65</th>\n",
              "      <th>feat_66</th>\n",
              "      <th>feat_67</th>\n",
              "      <th>feat_68</th>\n",
              "      <th>feat_69</th>\n",
              "      <th>feat_70</th>\n",
              "      <th>feat_71</th>\n",
              "      <th>feat_72</th>\n",
              "      <th>feat_73</th>\n",
              "      <th>feat_74</th>\n",
              "      <th>feat_75</th>\n",
              "      <th>feat_76</th>\n",
              "      <th>feat_77</th>\n",
              "      <th>feat_78</th>\n",
              "      <th>feat_79</th>\n",
              "      <th>feat_80</th>\n",
              "      <th>feat_81</th>\n",
              "      <th>feat_82</th>\n",
              "      <th>feat_83</th>\n",
              "      <th>feat_84</th>\n",
              "      <th>feat_85</th>\n",
              "      <th>feat_86</th>\n",
              "      <th>feat_87</th>\n",
              "      <th>feat_88</th>\n",
              "      <th>feat_89</th>\n",
              "      <th>feat_90</th>\n",
              "      <th>feat_91</th>\n",
              "      <th>feat_92</th>\n",
              "      <th>feat_93</th>\n",
              "      <th>target</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>7898</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>11288</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>9</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>10356</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>13439</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>8</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>3</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>54130</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>2</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>15</td>\n",
              "      <td>0</td>\n",
              "      <td>5</td>\n",
              "      <td>5</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>Class_8</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 95 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "      id  feat_1  feat_2  feat_3  ...  feat_91  feat_92  feat_93   target\n",
              "0   7898       0       0       0  ...        0        0        0  Class_2\n",
              "1  11288       0       0       0  ...        0        0        0  Class_2\n",
              "2  10356       0       0       0  ...        0        0        0  Class_2\n",
              "3  13439       1       0       0  ...        0        0        0  Class_2\n",
              "4  54130       0       0       0  ...        0        0        0  Class_8\n",
              "\n",
              "[5 rows x 95 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 15
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "bhvQ8gLwp3S4",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "org_data_path = \"./data/Otto_FE_train_org.csv\"\n",
        "data_use.to_csv(org_data_path, index=False, header=True)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "aAGv5eYyROv0",
        "colab_type": "text"
      },
      "source": [
        "## 5. 将数据切分为id, X_train, target三部分"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "XA1AK17LRe1s",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "id = data_use[\"id\"]\n",
        "target = data_use[\"target\"]\n",
        "X_train = data_use.drop([\"id\", \"target\"], axis=1)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qGvPA0QXRs0_",
        "colab_type": "text"
      },
      "source": [
        "## 6. 对特征列X_train进行PCA降维,并观察方差在各个成分上的分布"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "XK95RtgOR2vw",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from sklearn.decomposition import PCA\n",
        "pca = PCA(n_components=0.85)\n",
        "pca.fit(X_train)\n",
        "X_train_pca = pca.transform(X_train)"
      ],
      "execution_count": 0,
      "outputs": []
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "4ADTuB05STti",
        "colab_type": "code",
        "outputId": "a090dbd6-f9d2-48dd-eace-a57f6f8dc6ac",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 265
        }
      },
      "source": [
        "import matplotlib.pyplot as plt\n",
        "%matplotlib inline\n",
        "\n",
        "plt.bar(range(len(pca.explained_variance_ratio_)), pca.explained_variance_ratio_)\n",
        "plt.show()"
      ],
      "execution_count": 19,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAX8AAAD4CAYAAAAEhuazAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAR5klEQVR4nO3df6zd9V3H8efLdmWTKTC4WSawtBPU\ndM6g64omE5ctY0XiqrFsZbqBwaCJTWb82WnCsGoCZo6ZDHVVUAbOQnDTG+lEEpZMl8l6QWAWRO9Y\nN1rnuAPGZAtjhbd/nG/jyfH23nN/ntP7eT6Spt/v5/v5nvs+3/S+vp/z+X7Pt6kqJElt+bZRFyBJ\nWn2GvyQ1yPCXpAYZ/pLUIMNfkhq0ftQFDDrjjDNq48aNoy5Dkk4o995771eqamLY/mMX/hs3bmRq\namrUZUjSCSXJFxbS32kfSWqQ4S9JDTL8JalBhr8kNWio8E+yLckjSaaT7J5l+wVJ7ktyNMmOvvbz\nknw6ycEkDyZ5+3IWL0lanHnDP8k64HrgImAzcGmSzQPdvghcDnxkoP0bwLuq6tXANuADSU5datGS\npKUZ5lbPrcB0VT0KkGQfsB146FiHqjrUbXuhf8eq+o++5f9K8jgwAXx1yZVLkhZtmGmfM4HH+tYP\nd20LkmQrsAH43CzbrkwylWRqZmZmoS8tSVqgVbngm+QVwM3Az1XVC4Pbq2pvVW2pqi0TE0N/QU2S\ntEjDTPscAc7uWz+raxtKku8E7gB+u6r+ZWHlLdzG3Xccd9uhay5e6R8vSSeEYUb+B4Bzk2xKsgHY\nCUwO8+Jd/48BH66q2xdfpiRpOc0b/lV1FNgF3Ak8DNxWVQeT7EnyVoAkr0tyGLgE+FCSg93ubwMu\nAC5Pcn/357wVeSeSpKEN9WC3qtoP7B9ou6pv+QC96aDB/W4BbllijZKkZeY3fCWpQYa/JDXI8Jek\nBhn+ktQgw1+SGmT4S1KDDH9JapDhL0kNMvwlqUGGvyQ1yPCXpAYZ/pLUIMNfkhpk+EtSgwx/SWqQ\n4S9JDTL8JalBhr8kNcjwl6QGGf6S1CDDX5IaZPhLUoMMf0lqkOEvSQ0y/CWpQYa/JDVoqPBPsi3J\nI0mmk+yeZfsFSe5LcjTJjoFtlyX5z+7PZctVuCRp8eYN/yTrgOuBi4DNwKVJNg90+yJwOfCRgX1f\nBrwXOB/YCrw3yWlLL1uStBTDjPy3AtNV9WhVPQfsA7b3d6iqQ1X1IPDCwL5vAe6qqier6ingLmDb\nMtQtSVqCYcL/TOCxvvXDXdswhto3yZVJppJMzczMDPnSkqTFGosLvlW1t6q2VNWWiYmJUZcjSWve\nMOF/BDi7b/2srm0YS9lXkrRChgn/A8C5STYl2QDsBCaHfP07gQuTnNZd6L2wa5MkjdC84V9VR4Fd\n9EL7YeC2qjqYZE+StwIkeV2Sw8AlwIeSHOz2fRL4XXonkAPAnq5NkjRC64fpVFX7gf0DbVf1LR+g\nN6Uz2743AjcuoUZJ0jIbiwu+kqTVZfhLUoMMf0lqkOEvSQ0y/CWpQYa/JDXI8JekBhn+ktQgw1+S\nGmT4S1KDDH9JapDhL0kNMvwlqUGGvyQ1aKhHOq81G3ffcdxth665eBUrkaTRcOQvSQ0y/CWpQYa/\nJDXI8JekBhn+ktQgw1+SGmT4S1KDDH9JapDhL0kNMvwlqUFDhX+SbUkeSTKdZPcs209Kcmu3/Z4k\nG7v2FyW5Kclnkzyc5D3LW74kaTHmDf8k64DrgYuAzcClSTYPdLsCeKqqzgGuA67t2i8BTqqq1wCv\nBX7h2IlBkjQ6w4z8twLTVfVoVT0H7AO2D/TZDtzULd8OvClJgAJOTrIeeAnwHPC1ZalckrRow4T/\nmcBjfeuHu7ZZ+1TVUeBp4HR6
J4KvA18Cvgi8r6qeXGLNkqQlWukLvluB54HvAjYBv5rkVYOdklyZ\nZCrJ1MzMzAqXJEkaJvyPAGf3rZ/Vtc3ap5viOQV4AngH8A9V9a2qehz4FLBl8AdU1d6q2lJVWyYm\nJhb+LiRJCzJM+B8Azk2yKckGYCcwOdBnErisW94B3F1VRW+q540ASU4Gfhj49+UoXJK0ePOGfzeH\nvwu4E3gYuK2qDibZk+StXbcbgNOTTAO/Ahy7HfR64KVJDtI7ifxFVT243G9CkrQwQ/03jlW1H9g/\n0HZV3/Kz9G7rHNzvmdnaJUmj5Td8JalBhr8kNcjwl6QGGf6S1CDDX5IaZPhLUoMMf0lqkOEvSQ0y\n/CWpQYa/JDXI8JekBhn+ktQgw1+SGmT4S1KDDH9JapDhL0kNMvwlqUGGvyQ1yPCXpAYZ/pLUIMNf\nkhpk+EtSgwx/SWqQ4S9JDTL8JalBhr8kNWio8E+yLckjSaaT7J5l+0lJbu2235NkY9+2H0jy6SQH\nk3w2yYuXr3xJ0mKsn69DknXA9cCbgcPAgSSTVfVQX7crgKeq6pwkO4FrgbcnWQ/cAryzqh5Icjrw\nrWV/Fytg4+47jrvt0DUXr2IlkrT8hhn5bwWmq+rRqnoO2AdsH+izHbipW74deFOSABcCD1bVAwBV\n9URVPb88pUuSFmuY8D8TeKxv/XDXNmufqjoKPA2cDnwPUEnuTHJfkt+Y7QckuTLJVJKpmZmZhb4H\nSdICrfQF3/XA64Gf6f7+qSRvGuxUVXuraktVbZmYmFjhkiRJw4T/EeDsvvWzurZZ+3Tz/KcAT9D7\nlPDJqvpKVX0D2A/80FKLliQtzTDhfwA4N8mmJBuAncDkQJ9J4LJueQdwd1UVcCfwmiTf3p0Ufgx4\nCEnSSM17t09VHU2yi16QrwNurKqDSfYAU1U1CdwA3JxkGniS3gmCqnoqyfvpnUAK2F9Vx7+NRpK0\nKuYNf4Cq2k9vyqa/7aq+5WeBS46z7y30bveUJI0Jv+ErSQ0y/CWpQYa/JDXI8JekBhn+ktQgw1+S\nGmT4S1KDDH9JapDhL0kNMvwlqUGGvyQ1yPCXpAYZ/pLUIMNfkhpk+EtSgwx/SWqQ4S9JDRrqf/LS\n7DbuPv7/SHnomotXsRJJWhhH/pLUIMNfkhpk+EtSgwx/SWqQ4S9JDfJunxXmHUGSxpEjf0lqkOEv\nSQ0aKvyTbEvySJLpJLtn2X5Sklu77fck2Tiw/ZVJnknya8tTtiRpKeYN/yTrgOuBi4DNwKVJNg90\nuwJ4qqrOAa4Drh3Y/n7g40svV5K0HIa54LsVmK6qRwGS7AO2Aw/19dkOXN0t3w58MEmqqpL8JPB5\n4OvLVvUa40VhSattmGmfM4HH+tYPd22z9qmqo8DTwOlJXgr8JvA7c/2AJFcmmUoyNTMzM2ztkqRF\nWukLvlcD11XVM3N1qqq9VbWlqrZMTEyscEmSpGGmfY4AZ/etn9W1zdbncJL1wCnAE8D5wI4kfwCc\nCryQ5Nmq+uCSK5ckLdow4X8AODfJJnohvxN4x0CfSeAy4NPADuDuqirgR491SHI18IzBL0mjN2/4\nV9XRJLuAO4F1wI1VdTDJHmCqqiaBG4Cbk0wDT9I7QUiSxtRQj3eoqv3A/oG2q/qWnwUumec1rl5E\nfZKkFeA3fCWpQT7Y7QThdwEkLSfDfw3xBCFpWE77SFKDDH9JapDhL0kNMvwlqUGGvyQ1yPCXpAYZ\n/pLUIMNfkhpk+EtSg/yGb2P8FrAkcOQvSU1y5K//x08H0trnyF+SGmT4S1KDnPbRojg1JJ3YHPlL\nUoMMf0lqkOEvSQ0y/CWpQYa/JDXIu320YrwjSBpfjvwlqUGGvyQ1aKhpnyTbgD8C1gF/XlXXDGw/\nCfgw8FrgCeDtVXUoyZuBa4ANwHPAr1fV3ctYv05wTg1JozHvyD/JOuB64CJgM3Bpks0D3a4Anq
qq\nc4DrgGu79q8AP1FVrwEuA25ersIlSYs3zMh/KzBdVY8CJNkHbAce6uuzHbi6W74d+GCSVNW/9vU5\nCLwkyUlV9c0lV65mDPPpwE8Q0sIMM+d/JvBY3/rhrm3WPlV1FHgaOH2gz08D980W/EmuTDKVZGpm\nZmbY2iVJi7Qqt3omeTW9qaALZ9teVXuBvQBbtmyp1ahJ7fHTgfR/hhn5HwHO7ls/q2ubtU+S9cAp\n9C78kuQs4GPAu6rqc0stWJK0dMOE/wHg3CSbkmwAdgKTA30m6V3QBdgB3F1VleRU4A5gd1V9armK\nliQtzbzTPlV1NMku4E56t3reWFUHk+wBpqpqErgBuDnJNPAkvRMEwC7gHOCqJFd1bRdW1ePL/Uak\n5eDUkFox1Jx/Ve0H9g+0XdW3/CxwySz7/R7we0usUZK0zHy2j7RAfjrQWuDjHSSpQY78pRXgpwON\nO0f+ktQgR/7SiPjpQKPkyF+SGmT4S1KDnPaRxphPNNVKceQvSQ1y5C81wE8HGmT4SwI8QbTG8Jc0\nNK9BrB3O+UtSgxz5S1p1fjoYPcNf0ljyBLGyDH9JJyyvQSye4S+peS2eIAx/SRrCcn3KGJcTjXf7\nSFKDDH9JapDhL0kNMvwlqUGGvyQ1yPCXpAYZ/pLUIMNfkho0VPgn2ZbkkSTTSXbPsv2kJLd22+9J\nsrFv23u69keSvGX5SpckLda84Z9kHXA9cBGwGbg0yeaBblcAT1XVOcB1wLXdvpuBncCrgW3AH3ev\nJ0kaoWFG/luB6ap6tKqeA/YB2wf6bAdu6pZvB96UJF37vqr6ZlV9HpjuXk+SNEKpqrk7JDuAbVX1\n8936O4Hzq2pXX59/6/oc7tY/B5wPXA38S1Xd0rXfAHy8qm4f+BlXAld2q98LPLL0twbAGcBXlum1\nVos1rw5rXh3WvDrOAE6uqolhdxiLB7tV1V5g73K/bpKpqtqy3K+7kqx5dVjz6rDm1dHVvHEh+wwz\n7XMEOLtv/ayubdY+SdYDpwBPDLmvJGmVDRP+B4Bzk2xKsoHeBdzJgT6TwGXd8g7g7urNJ00CO7u7\ngTYB5wKfWZ7SJUmLNe+0T1UdTbILuBNYB9xYVQeT7AGmqmoSuAG4Ock08CS9EwRdv9uAh4CjwC9V\n1fMr9F5ms+xTSavAmleHNa8Oa14dC6553gu+kqS1x2/4SlKDDH9JatCaDf/5HkkxjpIcSvLZJPcn\nmRp1PbNJcmOSx7vvdhxre1mSu5L8Z/f3aaOscdBxar46yZHuWN+f5MdHWeOgJGcn+USSh5IcTPLu\nrn1sj/UcNY/tsU7y4iSfSfJAV/PvdO2bukfVTHePrtkw6lqPmaPmv0zy+b7jfN6cr7MW5/y7R0j8\nB/Bm4DC9O5YuraqHRlrYPJIcArZU1dh+wSTJBcAzwIer6vu7tj8Anqyqa7oT7WlV9ZujrLPfcWq+\nGnimqt43ytqOJ8krgFdU1X1JvgO4F/hJ4HLG9FjPUfPbGNNj3T2J4OSqeibJi4B/Bt4N/Arw0ara\nl+RPgQeq6k9GWesxc9T8i8DfD36J9njW6sh/mEdSaBGq6pP07ujq1/94j5vo/cKPjePUPNaq6ktV\ndV+3/D/Aw8CZjPGxnqPmsVU9z3SrL+r+FPBGeo+qgfE7zsereUHWavifCTzWt36YMf9H2CngH5Pc\n2z3y4kTx8qr6Urf838DLR1nMAuxK8mA3LTQ20yeDuqfk/iBwDyfIsR6oGcb4WCdZl+R+4HHgLuBz\nwFer6mjXZezyY7Dmqjp2nH+/O87XJTlprtdYq+F/onp9Vf0QvSeo/lI3XXFC6b7cdyLMJf4J8N3A\necCXgD8cbTmzS/JS4G+AX66qr/VvG9djPUvNY32sq+r5qjqP3hMItgLfN+KS5jVYc5LvB95Dr/bX\nAS8D5pwOXKvhf0I+VqKqjnR/Pw58jBPnCahf7uZ7j837Pj
7ieuZVVV/ufoFeAP6MMTzW3Xzu3wB/\nVVUf7ZrH+ljPVvOJcKwBquqrwCeAHwFO7R5VA2OcH301b+um3aqqvgn8BfMc57Ua/sM8kmKsJDm5\nu0hGkpOBC4F/m3uvsdH/eI/LgL8bYS1DORagnZ9izI51d1HvBuDhqnp/36axPdbHq3mcj3WSiSSn\ndssvoXeTyMP0AnVH123cjvNsNf9736Ag9K5RzHmc1+TdPgDd7WQf4P8eSfH7Iy5pTkleRW+0D73H\nbnxkHGtO8tfAG+g9QvbLwHuBvwVuA14JfAF4W1WNzQXW49T8BnrTEAUcAn6hby595JK8Hvgn4LPA\nC13zb9GbQx/LYz1HzZcypsc6yQ/Qu6C7jt5g+Laq2tP9Pu6jN33yr8DPdiPqkZuj5ruBCSDA/cAv\n9l0Y/v+vs1bDX5J0fGt12keSNAfDX5IaZPhLUoMMf0lqkOEvSQ0y/CWpQYa/JDXofwENGdCZRpiC\nZwAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "wEX_HPH9SzCP",
        "colab_type": "text"
      },
      "source": [
        "**可以看到前3个主成分的方差，占据了绝大部分的方差。**"
      ]
    },
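    {
      "cell_type": "markdown",
      "metadata": {
        "colab_type": "text"
      },
      "source": [
        "As a rough cross-check of the observation above, the sketch below counts how many leading components are needed before the cumulative explained-variance ratio exceeds a given fraction. The helper `components_for_threshold` and the toy ratios are hypothetical, introduced here only for illustration; the selection rule is meant to mirror how scikit-learn documents a fractional `n_components`, not to reproduce its internals."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import numpy as np\n",
        "\n",
        "\n",
        "def components_for_threshold(ratios, threshold):\n",
        "    # Hypothetical helper: smallest k such that the first k ratios sum to more than `threshold`.\n",
        "    cumulative = np.cumsum(np.asarray(ratios, dtype=float))\n",
        "    # side=\"right\" gives the index of the first cumulative value strictly above the threshold.\n",
        "    k = int(np.searchsorted(cumulative, threshold, side=\"right\")) + 1\n",
        "    # If the threshold is never exceeded, fall back to using every component.\n",
        "    return min(k, len(cumulative))\n",
        "\n",
        "\n",
        "# Toy ratios (made up for illustration): cumulative sums are about 0.5, 0.8, 0.9, 1.0.\n",
        "toy_ratios = [0.5, 0.3, 0.1, 0.1]\n",
        "print(components_for_threshold(toy_ratios, 0.85))\n",
        "\n",
        "# In this notebook the same helper could be applied to the fitted model, e.g.\n",
        "# components_for_threshold(pca.explained_variance_ratio_, 0.85)\n",
        "# Keep in mind that explained_variance_ratio_ lists only the retained components."
      ],
      "execution_count": 0,
      "outputs": []
    },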
    {
      "cell_type": "code",
      "metadata": {
        "id": "LGxWPrvSrOjd",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 224
        },
        "outputId": "37e35208-6458-4515-fcf5-9b47ab0a058e"
      },
      "source": [
        "n_components = pca.n_components_\n",
        "org_pca_feature_names = list()\n",
        "for i in range(n_components):\n",
        "  org_pca_feature_names.append(\"pca_{}\".format(i))\n",
        "\n",
        "X_train_org_pca_df = pd.DataFrame(data=X_train_pca, columns=org_pca_feature_names)\n",
        "\n",
        "org_pca_data = pd.concat([id, X_train_org_pca_df, target], axis=1)\n",
        "org_pca_data_path = \"./data/Otto_FE_train_pca.csv\"\n",
        "org_pca_data.to_csv(org_pca_data_path, index=False, header=True)\n",
        "\n",
        "org_pca_data.head()\n"
      ],
      "execution_count": 21,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>pca_0</th>\n",
              "      <th>pca_1</th>\n",
              "      <th>pca_2</th>\n",
              "      <th>pca_3</th>\n",
              "      <th>pca_4</th>\n",
              "      <th>pca_5</th>\n",
              "      <th>pca_6</th>\n",
              "      <th>pca_7</th>\n",
              "      <th>pca_8</th>\n",
              "      <th>pca_9</th>\n",
              "      <th>pca_10</th>\n",
              "      <th>pca_11</th>\n",
              "      <th>pca_12</th>\n",
              "      <th>pca_13</th>\n",
              "      <th>pca_14</th>\n",
              "      <th>pca_15</th>\n",
              "      <th>pca_16</th>\n",
              "      <th>pca_17</th>\n",
              "      <th>pca_18</th>\n",
              "      <th>pca_19</th>\n",
              "      <th>pca_20</th>\n",
              "      <th>pca_21</th>\n",
              "      <th>pca_22</th>\n",
              "      <th>pca_23</th>\n",
              "      <th>pca_24</th>\n",
              "      <th>pca_25</th>\n",
              "      <th>pca_26</th>\n",
              "      <th>pca_27</th>\n",
              "      <th>pca_28</th>\n",
              "      <th>pca_29</th>\n",
              "      <th>pca_30</th>\n",
              "      <th>pca_31</th>\n",
              "      <th>pca_32</th>\n",
              "      <th>pca_33</th>\n",
              "      <th>target</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>7898</td>\n",
              "      <td>-4.536745</td>\n",
              "      <td>-2.688173</td>\n",
              "      <td>-2.748430</td>\n",
              "      <td>1.058662</td>\n",
              "      <td>-0.708950</td>\n",
              "      <td>-0.882693</td>\n",
              "      <td>0.666887</td>\n",
              "      <td>0.395106</td>\n",
              "      <td>-0.203800</td>\n",
              "      <td>-0.505636</td>\n",
              "      <td>-0.199060</td>\n",
              "      <td>0.297237</td>\n",
              "      <td>0.618093</td>\n",
              "      <td>-0.456655</td>\n",
              "      <td>0.548557</td>\n",
              "      <td>0.891607</td>\n",
              "      <td>-0.858015</td>\n",
              "      <td>0.008375</td>\n",
              "      <td>0.619947</td>\n",
              "      <td>-0.253042</td>\n",
              "      <td>0.027602</td>\n",
              "      <td>0.344279</td>\n",
              "      <td>-0.204781</td>\n",
              "      <td>-0.778618</td>\n",
              "      <td>0.516475</td>\n",
              "      <td>0.360841</td>\n",
              "      <td>0.612916</td>\n",
              "      <td>0.719602</td>\n",
              "      <td>0.397100</td>\n",
              "      <td>0.744705</td>\n",
              "      <td>0.023056</td>\n",
              "      <td>0.208455</td>\n",
              "      <td>-0.100675</td>\n",
              "      <td>0.310035</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>11288</td>\n",
              "      <td>-2.660599</td>\n",
              "      <td>-4.943433</td>\n",
              "      <td>-0.343025</td>\n",
              "      <td>-1.748000</td>\n",
              "      <td>0.938032</td>\n",
              "      <td>-2.598540</td>\n",
              "      <td>4.736238</td>\n",
              "      <td>-2.767809</td>\n",
              "      <td>-0.572814</td>\n",
              "      <td>-0.745355</td>\n",
              "      <td>1.786481</td>\n",
              "      <td>0.237307</td>\n",
              "      <td>1.645473</td>\n",
              "      <td>0.848775</td>\n",
              "      <td>2.397526</td>\n",
              "      <td>-0.435221</td>\n",
              "      <td>-0.251499</td>\n",
              "      <td>1.200504</td>\n",
              "      <td>-1.516362</td>\n",
              "      <td>0.322930</td>\n",
              "      <td>7.747328</td>\n",
              "      <td>-3.669908</td>\n",
              "      <td>-0.390975</td>\n",
              "      <td>-1.137763</td>\n",
              "      <td>-4.477462</td>\n",
              "      <td>-2.791136</td>\n",
              "      <td>-0.968622</td>\n",
              "      <td>-0.343393</td>\n",
              "      <td>-3.173220</td>\n",
              "      <td>-2.174692</td>\n",
              "      <td>2.144261</td>\n",
              "      <td>-2.911447</td>\n",
              "      <td>2.168579</td>\n",
              "      <td>1.229423</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>10356</td>\n",
              "      <td>-2.088048</td>\n",
              "      <td>-3.416626</td>\n",
              "      <td>0.255663</td>\n",
              "      <td>0.989287</td>\n",
              "      <td>-0.168655</td>\n",
              "      <td>-2.035494</td>\n",
              "      <td>-1.574452</td>\n",
              "      <td>-0.477458</td>\n",
              "      <td>-1.770014</td>\n",
              "      <td>1.294337</td>\n",
              "      <td>0.997483</td>\n",
              "      <td>-1.570086</td>\n",
              "      <td>1.621107</td>\n",
              "      <td>-0.364949</td>\n",
              "      <td>1.370738</td>\n",
              "      <td>0.518049</td>\n",
              "      <td>-0.319211</td>\n",
              "      <td>0.020575</td>\n",
              "      <td>0.300617</td>\n",
              "      <td>-0.797239</td>\n",
              "      <td>1.408778</td>\n",
              "      <td>0.042296</td>\n",
              "      <td>0.221874</td>\n",
              "      <td>-1.054702</td>\n",
              "      <td>0.187690</td>\n",
              "      <td>-0.915440</td>\n",
              "      <td>-1.538772</td>\n",
              "      <td>-0.404660</td>\n",
              "      <td>-1.049678</td>\n",
              "      <td>-0.093555</td>\n",
              "      <td>-0.162868</td>\n",
              "      <td>-0.188923</td>\n",
              "      <td>0.298911</td>\n",
              "      <td>-0.217679</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>13439</td>\n",
              "      <td>-2.709519</td>\n",
              "      <td>-4.363292</td>\n",
              "      <td>-1.177761</td>\n",
              "      <td>-1.969763</td>\n",
              "      <td>0.732724</td>\n",
              "      <td>-1.769069</td>\n",
              "      <td>4.258412</td>\n",
              "      <td>-2.027677</td>\n",
              "      <td>-1.058451</td>\n",
              "      <td>-0.990625</td>\n",
              "      <td>0.441116</td>\n",
              "      <td>0.555796</td>\n",
              "      <td>0.822412</td>\n",
              "      <td>0.095477</td>\n",
              "      <td>1.715214</td>\n",
              "      <td>-0.110388</td>\n",
              "      <td>-0.141631</td>\n",
              "      <td>-0.079653</td>\n",
              "      <td>-0.658667</td>\n",
              "      <td>-0.073966</td>\n",
              "      <td>4.146606</td>\n",
              "      <td>-1.421027</td>\n",
              "      <td>-1.231516</td>\n",
              "      <td>-0.611851</td>\n",
              "      <td>-2.034836</td>\n",
              "      <td>-0.571494</td>\n",
              "      <td>-0.017538</td>\n",
              "      <td>-0.438237</td>\n",
              "      <td>-0.436407</td>\n",
              "      <td>-1.216706</td>\n",
              "      <td>0.053717</td>\n",
              "      <td>0.022295</td>\n",
              "      <td>-0.217739</td>\n",
              "      <td>0.569096</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>54130</td>\n",
              "      <td>1.054912</td>\n",
              "      <td>-2.519082</td>\n",
              "      <td>-3.391388</td>\n",
              "      <td>9.309992</td>\n",
              "      <td>12.883049</td>\n",
              "      <td>-0.779819</td>\n",
              "      <td>-1.786949</td>\n",
              "      <td>0.528708</td>\n",
              "      <td>-0.594271</td>\n",
              "      <td>-1.548571</td>\n",
              "      <td>-0.922574</td>\n",
              "      <td>-0.756413</td>\n",
              "      <td>0.773375</td>\n",
              "      <td>-0.303419</td>\n",
              "      <td>-0.104821</td>\n",
              "      <td>0.228923</td>\n",
              "      <td>-1.332062</td>\n",
              "      <td>-0.857397</td>\n",
              "      <td>-0.058820</td>\n",
              "      <td>-1.080153</td>\n",
              "      <td>-0.448226</td>\n",
              "      <td>-0.846956</td>\n",
              "      <td>-0.456221</td>\n",
              "      <td>0.012550</td>\n",
              "      <td>0.195758</td>\n",
              "      <td>-0.093331</td>\n",
              "      <td>0.252027</td>\n",
              "      <td>-0.334593</td>\n",
              "      <td>-0.430648</td>\n",
              "      <td>0.873963</td>\n",
              "      <td>1.029147</td>\n",
              "      <td>1.589061</td>\n",
              "      <td>0.257745</td>\n",
              "      <td>-0.350156</td>\n",
              "      <td>Class_8</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "      id     pca_0     pca_1     pca_2  ...    pca_31    pca_32    pca_33   target\n",
              "0   7898 -4.536745 -2.688173 -2.748430  ...  0.208455 -0.100675  0.310035  Class_2\n",
              "1  11288 -2.660599 -4.943433 -0.343025  ... -2.911447  2.168579  1.229423  Class_2\n",
              "2  10356 -2.088048 -3.416626  0.255663  ... -0.188923  0.298911 -0.217679  Class_2\n",
              "3  13439 -2.709519 -4.363292 -1.177761  ...  0.022295 -0.217739  0.569096  Class_2\n",
              "4  54130  1.054912 -2.519082 -3.391388  ...  1.589061  0.257745 -0.350156  Class_8\n",
              "\n",
              "[5 rows x 36 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 21
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "qrEnnpVVUebv",
        "colab_type": "text"
      },
      "source": [
        "## 7. 对原始特征进行tf-idf变换"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "rb1vQQgsUmXV",
        "colab_type": "code",
        "outputId": "d158297d-f73c-43c6-89bd-4e3461069ce7",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 253
        }
      },
      "source": [
        "from sklearn.feature_extraction.text import TfidfTransformer\n",
        "\n",
        "tfidf = TfidfTransformer()\n",
        "X_train_tfidf_csr_matrix = tfidf.fit_transform(X_train)\n",
        "\n",
        "tfidf_feature_names = X_train.columns + \"_tfidf\"\n",
        "X_train_tfidf = pd.DataFrame(data=X_train_tfidf_csr_matrix.toarray(), columns=tfidf_feature_names)\n",
        "X_train_tfidf.head()"
      ],
      "execution_count": 22,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>feat_1_tfidf</th>\n",
              "      <th>feat_2_tfidf</th>\n",
              "      <th>feat_3_tfidf</th>\n",
              "      <th>feat_4_tfidf</th>\n",
              "      <th>feat_5_tfidf</th>\n",
              "      <th>feat_6_tfidf</th>\n",
              "      <th>feat_7_tfidf</th>\n",
              "      <th>feat_8_tfidf</th>\n",
              "      <th>feat_9_tfidf</th>\n",
              "      <th>feat_10_tfidf</th>\n",
              "      <th>feat_11_tfidf</th>\n",
              "      <th>feat_12_tfidf</th>\n",
              "      <th>feat_13_tfidf</th>\n",
              "      <th>feat_14_tfidf</th>\n",
              "      <th>feat_15_tfidf</th>\n",
              "      <th>feat_16_tfidf</th>\n",
              "      <th>feat_17_tfidf</th>\n",
              "      <th>feat_18_tfidf</th>\n",
              "      <th>feat_19_tfidf</th>\n",
              "      <th>feat_20_tfidf</th>\n",
              "      <th>feat_21_tfidf</th>\n",
              "      <th>feat_22_tfidf</th>\n",
              "      <th>feat_23_tfidf</th>\n",
              "      <th>feat_24_tfidf</th>\n",
              "      <th>feat_25_tfidf</th>\n",
              "      <th>feat_26_tfidf</th>\n",
              "      <th>feat_27_tfidf</th>\n",
              "      <th>feat_28_tfidf</th>\n",
              "      <th>feat_29_tfidf</th>\n",
              "      <th>feat_30_tfidf</th>\n",
              "      <th>feat_31_tfidf</th>\n",
              "      <th>feat_32_tfidf</th>\n",
              "      <th>feat_33_tfidf</th>\n",
              "      <th>feat_34_tfidf</th>\n",
              "      <th>feat_35_tfidf</th>\n",
              "      <th>feat_36_tfidf</th>\n",
              "      <th>feat_37_tfidf</th>\n",
              "      <th>feat_38_tfidf</th>\n",
              "      <th>feat_39_tfidf</th>\n",
              "      <th>feat_40_tfidf</th>\n",
              "      <th>...</th>\n",
              "      <th>feat_54_tfidf</th>\n",
              "      <th>feat_55_tfidf</th>\n",
              "      <th>feat_56_tfidf</th>\n",
              "      <th>feat_57_tfidf</th>\n",
              "      <th>feat_58_tfidf</th>\n",
              "      <th>feat_59_tfidf</th>\n",
              "      <th>feat_60_tfidf</th>\n",
              "      <th>feat_61_tfidf</th>\n",
              "      <th>feat_62_tfidf</th>\n",
              "      <th>feat_63_tfidf</th>\n",
              "      <th>feat_64_tfidf</th>\n",
              "      <th>feat_65_tfidf</th>\n",
              "      <th>feat_66_tfidf</th>\n",
              "      <th>feat_67_tfidf</th>\n",
              "      <th>feat_68_tfidf</th>\n",
              "      <th>feat_69_tfidf</th>\n",
              "      <th>feat_70_tfidf</th>\n",
              "      <th>feat_71_tfidf</th>\n",
              "      <th>feat_72_tfidf</th>\n",
              "      <th>feat_73_tfidf</th>\n",
              "      <th>feat_74_tfidf</th>\n",
              "      <th>feat_75_tfidf</th>\n",
              "      <th>feat_76_tfidf</th>\n",
              "      <th>feat_77_tfidf</th>\n",
              "      <th>feat_78_tfidf</th>\n",
              "      <th>feat_79_tfidf</th>\n",
              "      <th>feat_80_tfidf</th>\n",
              "      <th>feat_81_tfidf</th>\n",
              "      <th>feat_82_tfidf</th>\n",
              "      <th>feat_83_tfidf</th>\n",
              "      <th>feat_84_tfidf</th>\n",
              "      <th>feat_85_tfidf</th>\n",
              "      <th>feat_86_tfidf</th>\n",
              "      <th>feat_87_tfidf</th>\n",
              "      <th>feat_88_tfidf</th>\n",
              "      <th>feat_89_tfidf</th>\n",
              "      <th>feat_90_tfidf</th>\n",
              "      <th>feat_91_tfidf</th>\n",
              "      <th>feat_92_tfidf</th>\n",
              "      <th>feat_93_tfidf</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.682684</td>\n",
              "      <td>0.278016</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.180011</td>\n",
              "      <td>0.198141</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.439853</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.256477</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.235994</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.264654</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.569271</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.110149</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.104231</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.154495</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.065596</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.073562</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.197374</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.193312</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.270962</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.158432</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.633419</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.384295</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.322727</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>0.128273</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.668426</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.066094</td>\n",
              "      <td>0.218253</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.098477</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.161500</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.140497</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.204081</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.103180</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.086649</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.194345</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.097954</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.057593</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.077535</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.061993</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.118274</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.129850</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.062872</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.88743</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.273109</td>\n",
              "      <td>0.265582</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.066991</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.064113</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 93 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "   feat_1_tfidf  feat_2_tfidf  ...  feat_92_tfidf  feat_93_tfidf\n",
              "0      0.000000           0.0  ...            0.0            0.0\n",
              "1      0.000000           0.0  ...            0.0            0.0\n",
              "2      0.000000           0.0  ...            0.0            0.0\n",
              "3      0.128273           0.0  ...            0.0            0.0\n",
              "4      0.000000           0.0  ...            0.0            0.0\n",
              "\n",
              "[5 rows x 93 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 22
        }
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "ob44xPVeYgxe",
        "colab_type": "code",
        "outputId": "aea73c18-b2c5-46e2-ca03-6b6befd010d4",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 253
        }
      },
      "source": [
        "tfidf_data = pd.concat([id, X_train_tfidf, target], axis=1)\n",
        "tfidf_data_path = \"./data/Otto_FE_train_tfidf.csv\"\n",
        "tfidf_data.to_csv(tfidf_data_path, index=False, header=True)\n",
        "tfidf_data.head()"
      ],
      "execution_count": 23,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>feat_1_tfidf</th>\n",
              "      <th>feat_2_tfidf</th>\n",
              "      <th>feat_3_tfidf</th>\n",
              "      <th>feat_4_tfidf</th>\n",
              "      <th>feat_5_tfidf</th>\n",
              "      <th>feat_6_tfidf</th>\n",
              "      <th>feat_7_tfidf</th>\n",
              "      <th>feat_8_tfidf</th>\n",
              "      <th>feat_9_tfidf</th>\n",
              "      <th>feat_10_tfidf</th>\n",
              "      <th>feat_11_tfidf</th>\n",
              "      <th>feat_12_tfidf</th>\n",
              "      <th>feat_13_tfidf</th>\n",
              "      <th>feat_14_tfidf</th>\n",
              "      <th>feat_15_tfidf</th>\n",
              "      <th>feat_16_tfidf</th>\n",
              "      <th>feat_17_tfidf</th>\n",
              "      <th>feat_18_tfidf</th>\n",
              "      <th>feat_19_tfidf</th>\n",
              "      <th>feat_20_tfidf</th>\n",
              "      <th>feat_21_tfidf</th>\n",
              "      <th>feat_22_tfidf</th>\n",
              "      <th>feat_23_tfidf</th>\n",
              "      <th>feat_24_tfidf</th>\n",
              "      <th>feat_25_tfidf</th>\n",
              "      <th>feat_26_tfidf</th>\n",
              "      <th>feat_27_tfidf</th>\n",
              "      <th>feat_28_tfidf</th>\n",
              "      <th>feat_29_tfidf</th>\n",
              "      <th>feat_30_tfidf</th>\n",
              "      <th>feat_31_tfidf</th>\n",
              "      <th>feat_32_tfidf</th>\n",
              "      <th>feat_33_tfidf</th>\n",
              "      <th>feat_34_tfidf</th>\n",
              "      <th>feat_35_tfidf</th>\n",
              "      <th>feat_36_tfidf</th>\n",
              "      <th>feat_37_tfidf</th>\n",
              "      <th>feat_38_tfidf</th>\n",
              "      <th>feat_39_tfidf</th>\n",
              "      <th>...</th>\n",
              "      <th>feat_55_tfidf</th>\n",
              "      <th>feat_56_tfidf</th>\n",
              "      <th>feat_57_tfidf</th>\n",
              "      <th>feat_58_tfidf</th>\n",
              "      <th>feat_59_tfidf</th>\n",
              "      <th>feat_60_tfidf</th>\n",
              "      <th>feat_61_tfidf</th>\n",
              "      <th>feat_62_tfidf</th>\n",
              "      <th>feat_63_tfidf</th>\n",
              "      <th>feat_64_tfidf</th>\n",
              "      <th>feat_65_tfidf</th>\n",
              "      <th>feat_66_tfidf</th>\n",
              "      <th>feat_67_tfidf</th>\n",
              "      <th>feat_68_tfidf</th>\n",
              "      <th>feat_69_tfidf</th>\n",
              "      <th>feat_70_tfidf</th>\n",
              "      <th>feat_71_tfidf</th>\n",
              "      <th>feat_72_tfidf</th>\n",
              "      <th>feat_73_tfidf</th>\n",
              "      <th>feat_74_tfidf</th>\n",
              "      <th>feat_75_tfidf</th>\n",
              "      <th>feat_76_tfidf</th>\n",
              "      <th>feat_77_tfidf</th>\n",
              "      <th>feat_78_tfidf</th>\n",
              "      <th>feat_79_tfidf</th>\n",
              "      <th>feat_80_tfidf</th>\n",
              "      <th>feat_81_tfidf</th>\n",
              "      <th>feat_82_tfidf</th>\n",
              "      <th>feat_83_tfidf</th>\n",
              "      <th>feat_84_tfidf</th>\n",
              "      <th>feat_85_tfidf</th>\n",
              "      <th>feat_86_tfidf</th>\n",
              "      <th>feat_87_tfidf</th>\n",
              "      <th>feat_88_tfidf</th>\n",
              "      <th>feat_89_tfidf</th>\n",
              "      <th>feat_90_tfidf</th>\n",
              "      <th>feat_91_tfidf</th>\n",
              "      <th>feat_92_tfidf</th>\n",
              "      <th>feat_93_tfidf</th>\n",
              "      <th>target</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>7898</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.682684</td>\n",
              "      <td>0.278016</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.180011</td>\n",
              "      <td>0.198141</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.256477</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.235994</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.264654</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>11288</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.569271</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.110149</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.104231</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.154495</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.065596</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.073562</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>10356</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.197374</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.193312</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.270962</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.158432</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.633419</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.384295</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.322727</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>13439</td>\n",
              "      <td>0.128273</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.668426</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.066094</td>\n",
              "      <td>0.218253</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.098477</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.140497</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.204081</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.103180</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.086649</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.194345</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>54130</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.097954</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.057593</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.077535</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.061993</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.118274</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.129850</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.062872</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.88743</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.273109</td>\n",
              "      <td>0.265582</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.066991</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.064113</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>0.0</td>\n",
              "      <td>Class_8</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 95 columns</p>\n",
              "</div>"
            ],
            "text/plain": [
              "      id  feat_1_tfidf  feat_2_tfidf  ...  feat_92_tfidf  feat_93_tfidf   target\n",
              "0   7898      0.000000           0.0  ...            0.0            0.0  Class_2\n",
              "1  11288      0.000000           0.0  ...            0.0            0.0  Class_2\n",
              "2  10356      0.000000           0.0  ...            0.0            0.0  Class_2\n",
              "3  13439      0.128273           0.0  ...            0.0            0.0  Class_2\n",
              "4  54130      0.000000           0.0  ...            0.0            0.0  Class_8\n",
              "\n",
              "[5 rows x 95 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 23
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "PWg7lurlWi7r",
        "colab_type": "text"
      },
      "source": [
        "## 8. 对tfidf特征进行PCA降维，并观察方差在各个成分上的分布"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "Y9gs3yEDWqAV",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "from sklearn.decomposition import PCA\n",
        "pca_tfidf = PCA(n_components=0.85)\n",
        "pca_tfidf.fit(X_train_tfidf)\n",
        "X_train_tfidf_pca = pca_tfidf.transform(X_train_tfidf)"
      ],
      "execution_count": 0,
      "outputs": []
    },
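    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "A note on `n_components=0.85`: when scikit-learn's `PCA` receives a float between 0 and 1, it keeps the smallest number of components whose cumulative explained variance ratio exceeds that fraction. The sketch below illustrates this on synthetic data; `X_demo` and `pca_demo` are illustrative names, and the random matrix is an assumption, not the Otto features.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "from sklearn.decomposition import PCA\n",
        "\n",
        "# Synthetic stand-in for the TF-IDF matrix (an assumption, not the Otto data):\n",
        "# 200 samples, 20 features with steadily shrinking scale.\n",
        "rng = np.random.RandomState(0)\n",
        "X_demo = rng.rand(200, 20) * np.linspace(1.0, 0.1, 20)\n",
        "\n",
        "pca_demo = PCA(n_components=0.85).fit(X_demo)\n",
        "cumulative = np.cumsum(pca_demo.explained_variance_ratio_)\n",
        "\n",
        "# Number of components kept, and the cumulative ratio they reach (at least 0.85)\n",
        "print(pca_demo.n_components_, cumulative[-1])\n",
        "```\n",
        "\n",
        "Because the threshold is a fraction of total variance, the number of retained components depends on the data and is only known after fitting, which is why later cells read it from `pca_tfidf.n_components_`."
      ]
    },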
    {
      "cell_type": "code",
      "metadata": {
        "id": "YTW2pFcCW_Nx",
        "colab_type": "code",
        "outputId": "5ded0664-d139-46bd-a653-3c4d124cacb9",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 265
        }
      },
      "source": [
        "import matplotlib.pyplot as plt\n",
        "%matplotlib inline\n",
        "\n",
        "plt.bar(range(len(pca_tfidf.explained_variance_ratio_)), pca_tfidf.explained_variance_ratio_)\n",
        "plt.show()"
      ],
      "execution_count": 25,
      "outputs": [
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD4CAYAAADiry33AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAPv0lEQVR4nO3df4xmV13H8ffHXVoQYgvtSHC3OEu6\nahZB1HWLEZW0oWwtdjFudQvqmtRUEzbBAMHFP0pZMaHGUEysiRu32oDaNvXXxC5uGkqCIVh2Wn7U\nbd0wlEK3VDptl2JN2rLl6x/PrTz7MO3c7jzz68z7lUz23nPPfeY8Jzufe+Y8955JVSFJatf3LXcD\nJEmLy6CXpMYZ9JLUOINekhpn0EtS49YvdwNGnX322TU5ObnczZCkVeWOO+54uKom5jq24oJ+cnKS\n6enp5W6GJK0qSb76bMecupGkxhn0ktQ4g16SGmfQS1LjDHpJapxBL0mNM+glqXEGvSQ1zqCXpMat\nuCdjF2py7y3fU3bfhy5ehpZI0srgiF6SGmfQS1LjDHpJapxBL0mNM+glqXEGvSQ1zqCXpMYZ9JLU\nOINekhrXK+iTbE9yNMlMkr1zHP+FJHcmOZFk58ix3Um+1H3tHlfDJUn9zBv0SdYB1wIXAVuAy5Js\nGan2NeC3gb8bOfdlwPuB84BtwPuTvHThzZYk9dVnRL8NmKmqe6vqKeAGYMdwhaq6r6q+CHxn5Nw3\nA7dW1aNVdRy4Fdg+hnZLknrqE/QbgPuH9o91ZX30OjfJFUmmk0zPzs72fGlJUh8r4sPYqtpfVVur\nauvExMRyN0eSmtIn6B8Azhna39iV9bGQcyVJY9An6A8Dm5NsSnIasAuY6vn6h4ALk7y0+xD2wq5M\nkrRE5g36qjoB7GEQ0PcAN1XVkST7klwCkORnkhwDLgX+MsmR7txHgT9icLE4DOzryiRJS6TXX5iq\nqoPAwZGyK4e2DzOYlpnr3OuA6xbQRknSAqyID2MlSYvHoJekxhn0ktQ4g16SGmfQS1LjDHpJapxB\nL0mNM+glqXEGvSQ1zqCXpMYZ9JLUOINekhpn0EtS4wx6SWqcQS9JjTPoJalxBr0kNc6gl6TGGfSS\n1DiDXpIaZ9BLUuMMeklqnEEvSY0z6CWpcQa9JDXOoJekxhn0ktQ4g16SGmfQS1LjDHpJalyvoE+y\nPcnRJDNJ9s5x/PQkN3bHb08y2ZW/IMn1Se5Kck+S9423+ZKk+cwb9EnWAdcCFwFbgMuSbBmpdjlw\nvKrOBa4Bru7KLwVOr6rXAD8N/O4zFwFJ0tLoM6LfBsxU1b1V9RRwA7BjpM4O4Ppu+2bggiQBCnhx\nkvXAi4CngG+NpeWSpF76BP0G4P6h/WNd2Zx1quoE8BhwFoPQ/1/gQeBrwJ9W1aOj3yDJFUmmk0zP\nzs4+7zchSXp2i/1h7DbgaeCHgE3Au5O8arRSVe2vqq1VtXViYmKRmyRJa0ufoH8AOGdof2NXNmed\nbprmDOAR4G3Av1XVt6vqIeDTwNaFNlqS1F+foD8MbE6yKclpwC5gaqTOFLC7294J3FZVxWC65nyA\nJC8GXg/81zgaLknqZ96g7+bc9wCHgHuAm6rqSJJ9SS7pqh0AzkoyA7wLeOYWzGuBlyQ5wuCC8ddV\n9cVxvwlJ0rNb36dSVR0EDo6UXTm0/QSDWylHz3t8rnJJ0tLxyVhJapxBL0mNM+glqXEGvSQ1zqCX\npMYZ9JLUOINekhpn0EtS4wx6SWqcQS9JjTPoJalxBr0kNa7XomYtmNx7y5zl933o4iVuiSQtLUf0\nktQ4g16SGmfQS1LjDHpJapxBL0mNM+glqXEGvSQ1zqCXpMYZ9JLUOINekhpn0EtS4wx6SWqcQS9J\njTPoJalxBr0kNc6gl6TGGfSS
1LheQZ9ke5KjSWaS7J3j+OlJbuyO355kcujYa5N8JsmRJHcleeH4\nmi9Jms+8QZ9kHXAtcBGwBbgsyZaRapcDx6vqXOAa4Oru3PXAx4Dfq6pXA28Evj221kuS5tVnRL8N\nmKmqe6vqKeAGYMdInR3A9d32zcAFSQJcCHyxqr4AUFWPVNXT42m6JKmPPkG/Abh/aP9YVzZnnao6\nATwGnAX8CFBJDiW5M8l75/oGSa5IMp1kenZ29vm+B0nSc1jsD2PXA28A3t79+ytJLhitVFX7q2pr\nVW2dmJhY5CZJ0tqyvkedB4BzhvY3dmVz1TnWzcufATzCYPT/qap6GCDJQeCngE8ssN1jNbn3ljnL\n7/vQxUvcEkkavz4j+sPA5iSbkpwG7AKmRupMAbu77Z3AbVVVwCHgNUm+v7sA/CJw93iaLknqY94R\nfVWdSLKHQWivA66rqiNJ9gHTVTUFHAA+mmQGeJTBxYCqOp7kwwwuFgUcrKq5h8+SpEXRZ+qGqjoI\nHBwpu3Jo+wng0mc592MMbrGUJC0Dn4yVpMYZ9JLUOINekhpn0EtS4wx6SWqcQS9JjTPoJalxBr0k\nNa7XA1Nr2Vzr4LgGjqTVxBG9JDXOoJekxhn0ktQ4g16SGmfQS1LjDHpJapxBL0mNM+glqXEGvSQ1\nzqCXpMYZ9JLUOINekhpn0EtS41y9cgFc2VLSauCIXpIaZ9BLUuMMeklqnEEvSY0z6CWpcQa9JDWu\nV9An2Z7kaJKZJHvnOH56khu747cnmRw5/sokjyd5z3iaLUnqa96gT7IOuBa4CNgCXJZky0i1y4Hj\nVXUucA1w9cjxDwMfX3hzJUnPV58R/TZgpqruraqngBuAHSN1dgDXd9s3AxckCUCStwJfAY6Mp8mS\npOejT9BvAO4f2j/Wlc1Zp6pOAI8BZyV5CfAHwAee6xskuSLJdJLp2dnZvm2XJPWw2B/GXgVcU1WP\nP1elqtpfVVurauvExMQiN0mS1pY+a908AJwztL+xK5urzrEk64EzgEeA84CdSf4EOBP4TpInqurP\nF9xySVIvfYL+MLA5ySYGgb4LeNtInSlgN/AZYCdwW1UV8PPPVEhyFfC4IS9JS2veoK+qE0n2AIeA\ndcB1VXUkyT5guqqmgAPAR5PMAI8yuBhIklaAXssUV9VB4OBI2ZVD208Al87zGledQvtWLZcwlrRS\n+GSsJDXOoJekxhn0ktQ4g16SGmfQS1LjDHpJapxBL0mN63UfvcZnrvvrwXvsJS0eR/SS1DiDXpIa\nZ9BLUuMMeklqnEEvSY0z6CWpcQa9JDXO++hXEO+xl7QYHNFLUuMc0a8S/sUqSafKEb0kNc6gl6TG\nGfSS1DiDXpIaZ9BLUuMMeklqnEEvSY0z6CWpcQa9JDXOJ2Mb4FOzkp6LI3pJapxBL0mN6xX0SbYn\nOZpkJsneOY6fnuTG7vjtSSa78jcluSPJXd2/54+3+ZKk+cw7R59kHXAt8CbgGHA4yVRV3T1U7XLg\neFWdm2QXcDXw68DDwC9X1deT/DhwCNgw7jehubm+vSToN6LfBsxU1b1V9RRwA7BjpM4O4Ppu+2bg\ngiSpqs9V1de78iPAi5KcPo6GS5L66RP0G4D7h/aP8b2j8v+vU1UngMeAs0bq/CpwZ1U9OfoNklyR\nZDrJ9OzsbN+2S5J6WJLbK5O8msF0zoVzHa+q/cB+gK1bt9ZStGmtc1pHWjv6jOgfAM4Z2t/Ylc1Z\nJ8l64AzgkW5/I/BPwG9V1ZcX2mBJ0vPTZ0R/GNicZBODQN8FvG2kzhSwG/gMsBO4raoqyZnALcDe\nqvr0+JqtxfRco30fzpJWn3lH9N2c+x4Gd8zcA9xUVUeS7EtySVftAHBWkhngXcAzt2DuAc4Frkzy\n+e7rB8f+LiRJz6rXHH1VHQQOjpRdObT9BHDpHOd9EPjgAtsoSVoAn4yVpMa5qJnGxvl7aWVyRC
9J\njTPoJalxBr0kNc45ei06n8KVlpcjeklqnCN6LStH+9LiM+i1Ynm7pjQeBr1Wpee6CHiBkE5m0GvN\ncJpIa5VBL+FFQG0z6KV5OE2k1c6glxaJFwGtFAa9tMSe7x92eeaYdKoMemmVcApJp8qglxr3bBcB\nf3tYOwx6Sd/jVP9usL9ZrEwGvaQl4W8Wy8egl7RinepvFjqZQS+pOacyvXSqd0OthguOQS9Ji2Sl\nXAQMeklaYkv9uYR/eESSGmfQS1LjDHpJapxBL0mNM+glqXEGvSQ1zqCXpMb1Cvok25McTTKTZO8c\nx09PcmN3/PYkk0PH3teVH03y5vE1XZLUx7xBn2QdcC1wEbAFuCzJlpFqlwPHq+pc4Brg6u7cLcAu\n4NXAduAvuteTJC2RPiP6bcBMVd1bVU8BNwA7RursAK7vtm8GLkiSrvyGqnqyqr4CzHSvJ0laIqmq\n566Q7AS2V9XvdPu/CZxXVXuG6vxnV+dYt/9l4DzgKuA/qupjXfkB4ONVdfPI97gCuKLb/VHg6MLf\nGmcDD4/hdVphf5zM/jiZ/XGy1dgfP1xVE3MdWBFr3VTVfmD/OF8zyXRVbR3na65m9sfJ7I+T2R8n\na60/+kzdPACcM7S/sSubs06S9cAZwCM9z5UkLaI+QX8Y2JxkU5LTGHy4OjVSZwrY3W3vBG6rwZzQ\nFLCruytnE7AZ+Ox4mi5J6mPeqZuqOpFkD3AIWAdcV1VHkuwDpqtqCjgAfDTJDPAog4sBXb2bgLuB\nE8A7qurpRXovo8Y6FdQA++Nk9sfJ7I+TNdUf834YK0la3XwyVpIaZ9BLUuOaDPr5lmxoXZLrkjzU\nPd/wTNnLktya5Evdvy9dzjYupSTnJPlkkruTHEnyzq58TfZJkhcm+WySL3T98YGufFO3hMlMt6TJ\nacvd1qWSZF2SzyX5126/qb5oLuh7LtnQur9hsOTEsL3AJ6pqM/CJbn+tOAG8u6q2AK8H3tH9n1ir\nffIkcH5V/QTwOmB7ktczWLrkmm4pk+MMljZZK94J3DO031RfNBf09FuyoWlV9SkGdz8NG16m4nrg\nrUvaqGVUVQ9W1Z3d9v8w+IHewBrtkxp4vNt9QfdVwPkMljCBNdQfSTYCFwN/1e2HxvqixaDfANw/\ntH+sK1vrXl5VD3bb/w28fDkbs1y6lVV/EridNdwn3VTF54GHgFuBLwPfrKoTXZW19HPzEeC9wHe6\n/bNorC9aDHrNo3uYbc3dV5vkJcA/AL9fVd8aPrbW+qSqnq6q1zF4Wn0b8GPL3KRlkeQtwENVdcdy\nt2UxrYi1bsbMZRfm9o0kr6iqB5O8gsFIbs1I8gIGIf+3VfWPXfGa7hOAqvpmkk8CPwucmWR9N5Jd\nKz83PwdckuSXgBcCPwD8GY31RYsj+j5LNqxFw8tU7Ab+ZRnbsqS6OdcDwD1V9eGhQ2uyT5JMJDmz\n234R8CYGn1t8ksESJrBG+qOq3ldVG6tqkkFW3FZVb6exvmjyydju6vwRvrtkwx8vc5OWVJK/B97I\nYKnVbwDvB/4ZuAl4JfBV4NeqavQD2yYleQPw78BdfHce9g8ZzNOvuT5J8loGHzCuYzDYu6mq9iV5\nFYObF14GfA74jap6cvlaurSSvBF4T1W9pbW+aDLoJUnf1eLUjSRpiEEvSY0z6CWpcQa9JDXOoJek\nxhn0ktQ4g16SGvd/DRa0hkIW+PsAAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "TeyfjZALXUbZ",
        "colab_type": "text"
      },
      "source": [
        "**可以看到经过tfidf后的特征，再进行PCA降维，方差分布进一步向头部的主成分集中**"
      ]
    },
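    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "One way to put a number on this concentration is the share of retained variance carried by the first k components. The helper below is a minimal sketch; `top_k_share` and `demo_ratios` are hypothetical names, and the ratios are made up for illustration rather than taken from the fitted model.\n",
        "\n",
        "```python\n",
        "import numpy as np\n",
        "\n",
        "def top_k_share(ratios, k):\n",
        "    # Fraction of the retained variance carried by the first k components\n",
        "    ratios = np.asarray(ratios, dtype=float)\n",
        "    return ratios[:k].sum() / ratios.sum()\n",
        "\n",
        "# Hypothetical ratios for illustration only\n",
        "demo_ratios = [0.40, 0.20, 0.10, 0.05]\n",
        "print(top_k_share(demo_ratios, 2))\n",
        "```\n",
        "\n",
        "In this notebook, `top_k_share(pca_tfidf.explained_variance_ratio_, 5)` would give the share for the first five TF-IDF components, and the same call on another fitted PCA would allow a like-for-like comparison."
      ]
    },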
    {
      "cell_type": "code",
      "metadata": {
        "id": "7Xys9lAI0Ti8",
        "colab_type": "code",
        "outputId": "d9bd2e50-ab98-4710-e71f-16c3f2428005",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 224
        }
      },
      "source": [
        "n_components = pca_tfidf.n_components_\n",
        "tfidf_pca_feature_names = list()\n",
        "for i in range(n_components):\n",
        "  tfidf_pca_feature_names.append(\"pca_{}\".format(i))\n",
        "\n",
        "X_train_tfidf_pca_df = pd.DataFrame(data=X_train_tfidf_pca, columns=tfidf_pca_feature_names)\n",
        "\n",
        "tfidf_pca_data = pd.concat([id, X_train_tfidf_pca_df, target], axis=1)\n",
        "tfidf_pca_data_path = \"./data/Otto_FE_train_tfidf_pca.csv\"\n",
        "tfidf_pca_data.to_csv(tfidf_pca_data_path, index=False, header=True)\n",
        "\n",
        "tfidf_pca_data.head()"
      ],
      "execution_count": 27,
      "outputs": [
        {
          "output_type": "execute_result",
          "data": {
            "text/html": [
              "<div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>id</th>\n",
              "      <th>pca_0</th>\n",
              "      <th>pca_1</th>\n",
              "      <th>pca_2</th>\n",
              "      <th>pca_3</th>\n",
              "      <th>pca_4</th>\n",
              "      <th>pca_5</th>\n",
              "      <th>pca_6</th>\n",
              "      <th>pca_7</th>\n",
              "      <th>pca_8</th>\n",
              "      <th>pca_9</th>\n",
              "      <th>pca_10</th>\n",
              "      <th>pca_11</th>\n",
              "      <th>pca_12</th>\n",
              "      <th>pca_13</th>\n",
              "      <th>pca_14</th>\n",
              "      <th>pca_15</th>\n",
              "      <th>pca_16</th>\n",
              "      <th>pca_17</th>\n",
              "      <th>pca_18</th>\n",
              "      <th>pca_19</th>\n",
              "      <th>pca_20</th>\n",
              "      <th>pca_21</th>\n",
              "      <th>pca_22</th>\n",
              "      <th>pca_23</th>\n",
              "      <th>pca_24</th>\n",
              "      <th>pca_25</th>\n",
              "      <th>pca_26</th>\n",
              "      <th>pca_27</th>\n",
              "      <th>pca_28</th>\n",
              "      <th>pca_29</th>\n",
              "      <th>pca_30</th>\n",
              "      <th>pca_31</th>\n",
              "      <th>pca_32</th>\n",
              "      <th>pca_33</th>\n",
              "      <th>pca_34</th>\n",
              "      <th>pca_35</th>\n",
              "      <th>pca_36</th>\n",
              "      <th>pca_37</th>\n",
              "      <th>pca_38</th>\n",
              "      <th>pca_39</th>\n",
              "      <th>pca_40</th>\n",
              "      <th>pca_41</th>\n",
              "      <th>pca_42</th>\n",
              "      <th>pca_43</th>\n",
              "      <th>pca_44</th>\n",
              "      <th>target</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>7898</td>\n",
              "      <td>0.667875</td>\n",
              "      <td>0.057747</td>\n",
              "      <td>0.050709</td>\n",
              "      <td>-0.147997</td>\n",
              "      <td>0.054880</td>\n",
              "      <td>0.091181</td>\n",
              "      <td>0.125951</td>\n",
              "      <td>0.011606</td>\n",
              "      <td>-0.137475</td>\n",
              "      <td>-0.009323</td>\n",
              "      <td>-0.065140</td>\n",
              "      <td>-0.151212</td>\n",
              "      <td>0.112089</td>\n",
              "      <td>-0.119152</td>\n",
              "      <td>-0.026706</td>\n",
              "      <td>-0.000103</td>\n",
              "      <td>-0.184158</td>\n",
              "      <td>-0.046322</td>\n",
              "      <td>-0.074406</td>\n",
              "      <td>0.031657</td>\n",
              "      <td>0.001915</td>\n",
              "      <td>0.025119</td>\n",
              "      <td>0.019071</td>\n",
              "      <td>-0.049231</td>\n",
              "      <td>0.052767</td>\n",
              "      <td>0.046975</td>\n",
              "      <td>-0.049890</td>\n",
              "      <td>0.052416</td>\n",
              "      <td>0.038230</td>\n",
              "      <td>-0.068870</td>\n",
              "      <td>0.013143</td>\n",
              "      <td>-0.070346</td>\n",
              "      <td>-0.099009</td>\n",
              "      <td>0.083838</td>\n",
              "      <td>-0.101378</td>\n",
              "      <td>-0.048706</td>\n",
              "      <td>-0.055945</td>\n",
              "      <td>-0.027905</td>\n",
              "      <td>-0.004466</td>\n",
              "      <td>0.061262</td>\n",
              "      <td>0.045468</td>\n",
              "      <td>-0.006921</td>\n",
              "      <td>-0.022299</td>\n",
              "      <td>0.026603</td>\n",
              "      <td>-0.027396</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>11288</td>\n",
              "      <td>0.429502</td>\n",
              "      <td>-0.023209</td>\n",
              "      <td>-0.247563</td>\n",
              "      <td>-0.004651</td>\n",
              "      <td>0.011031</td>\n",
              "      <td>-0.088414</td>\n",
              "      <td>-0.021458</td>\n",
              "      <td>-0.332015</td>\n",
              "      <td>-0.289486</td>\n",
              "      <td>-0.126267</td>\n",
              "      <td>0.101066</td>\n",
              "      <td>0.029950</td>\n",
              "      <td>-0.149777</td>\n",
              "      <td>0.056233</td>\n",
              "      <td>0.230039</td>\n",
              "      <td>-0.033686</td>\n",
              "      <td>0.248555</td>\n",
              "      <td>-0.037454</td>\n",
              "      <td>0.095499</td>\n",
              "      <td>-0.276584</td>\n",
              "      <td>-0.219848</td>\n",
              "      <td>0.085154</td>\n",
              "      <td>-0.078203</td>\n",
              "      <td>0.005667</td>\n",
              "      <td>0.030845</td>\n",
              "      <td>0.020061</td>\n",
              "      <td>-0.035965</td>\n",
              "      <td>0.074397</td>\n",
              "      <td>-0.091297</td>\n",
              "      <td>0.029148</td>\n",
              "      <td>0.010200</td>\n",
              "      <td>0.000959</td>\n",
              "      <td>-0.026041</td>\n",
              "      <td>-0.026402</td>\n",
              "      <td>-0.015314</td>\n",
              "      <td>-0.040211</td>\n",
              "      <td>-0.002492</td>\n",
              "      <td>-0.014329</td>\n",
              "      <td>-0.031991</td>\n",
              "      <td>-0.039471</td>\n",
              "      <td>-0.004291</td>\n",
              "      <td>0.004059</td>\n",
              "      <td>0.023400</td>\n",
              "      <td>-0.083766</td>\n",
              "      <td>0.038580</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>10356</td>\n",
              "      <td>0.105192</td>\n",
              "      <td>-0.245815</td>\n",
              "      <td>-0.001409</td>\n",
              "      <td>0.074403</td>\n",
              "      <td>-0.150605</td>\n",
              "      <td>-0.212282</td>\n",
              "      <td>-0.236170</td>\n",
              "      <td>-0.405033</td>\n",
              "      <td>0.097142</td>\n",
              "      <td>-0.071967</td>\n",
              "      <td>-0.152243</td>\n",
              "      <td>0.154262</td>\n",
              "      <td>-0.204578</td>\n",
              "      <td>0.030658</td>\n",
              "      <td>0.075497</td>\n",
              "      <td>0.041007</td>\n",
              "      <td>0.147396</td>\n",
              "      <td>-0.050048</td>\n",
              "      <td>0.166750</td>\n",
              "      <td>0.101393</td>\n",
              "      <td>-0.010664</td>\n",
              "      <td>0.027279</td>\n",
              "      <td>-0.136182</td>\n",
              "      <td>0.103666</td>\n",
              "      <td>-0.048154</td>\n",
              "      <td>-0.125048</td>\n",
              "      <td>0.040598</td>\n",
              "      <td>-0.063761</td>\n",
              "      <td>0.044946</td>\n",
              "      <td>0.098753</td>\n",
              "      <td>0.122823</td>\n",
              "      <td>0.174589</td>\n",
              "      <td>0.076438</td>\n",
              "      <td>-0.004225</td>\n",
              "      <td>-0.014557</td>\n",
              "      <td>-0.097351</td>\n",
              "      <td>0.023938</td>\n",
              "      <td>-0.067994</td>\n",
              "      <td>-0.074638</td>\n",
              "      <td>0.029473</td>\n",
              "      <td>0.039766</td>\n",
              "      <td>-0.008185</td>\n",
              "      <td>-0.010258</td>\n",
              "      <td>-0.120268</td>\n",
              "      <td>0.005394</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>13439</td>\n",
              "      <td>0.573379</td>\n",
              "      <td>0.019872</td>\n",
              "      <td>-0.202900</td>\n",
              "      <td>-0.047554</td>\n",
              "      <td>-0.008548</td>\n",
              "      <td>-0.078273</td>\n",
              "      <td>-0.007455</td>\n",
              "      <td>-0.211374</td>\n",
              "      <td>-0.229601</td>\n",
              "      <td>-0.109178</td>\n",
              "      <td>0.086891</td>\n",
              "      <td>0.022025</td>\n",
              "      <td>-0.103540</td>\n",
              "      <td>-0.002404</td>\n",
              "      <td>0.035577</td>\n",
              "      <td>-0.074864</td>\n",
              "      <td>0.170792</td>\n",
              "      <td>0.039078</td>\n",
              "      <td>0.091032</td>\n",
              "      <td>-0.093451</td>\n",
              "      <td>-0.016984</td>\n",
              "      <td>0.055547</td>\n",
              "      <td>0.018713</td>\n",
              "      <td>0.000073</td>\n",
              "      <td>-0.048193</td>\n",
              "      <td>0.199426</td>\n",
              "      <td>-0.072457</td>\n",
              "      <td>-0.055759</td>\n",
              "      <td>0.038095</td>\n",
              "      <td>-0.014810</td>\n",
              "      <td>0.076690</td>\n",
              "      <td>0.016317</td>\n",
              "      <td>0.014692</td>\n",
              "      <td>0.013328</td>\n",
              "      <td>-0.092075</td>\n",
              "      <td>-0.062274</td>\n",
              "      <td>0.060006</td>\n",
              "      <td>-0.049708</td>\n",
              "      <td>0.010847</td>\n",
              "      <td>-0.023870</td>\n",
              "      <td>-0.110555</td>\n",
              "      <td>-0.006765</td>\n",
              "      <td>0.050134</td>\n",
              "      <td>-0.061041</td>\n",
              "      <td>-0.085273</td>\n",
              "      <td>Class_2</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>54130</td>\n",
              "      <td>-0.163251</td>\n",
              "      <td>-0.099113</td>\n",
              "      <td>-0.070745</td>\n",
              "      <td>0.033169</td>\n",
              "      <td>-0.021083</td>\n",
              "      <td>-0.229392</td>\n",
              "      <td>0.080799</td>\n",
              "      <td>0.189776</td>\n",
              "      <td>-0.166591</td>\n",
              "      <td>0.203352</td>\n",
              "      <td>-0.017575</td>\n",
              "      <td>-0.013158</td>\n",
              "      <td>-0.055050</td>\n",
              "      <td>0.040532</td>\n",
              "      <td>-0.088878</td>\n",
              "      <td>0.066811</td>\n",
              "      <td>0.003510</td>\n",
              "      <td>0.024627</td>\n",
              "      <td>0.059367</td>\n",
              "      <td>-0.202620</td>\n",
              "      <td>-0.011173</td>\n",
              "      <td>-0.077038</td>\n",
              "      <td>-0.200964</td>\n",
              "      <td>-0.013388</td>\n",
              "      <td>0.120761</td>\n",
              "      <td>0.058965</td>\n",
              "      <td>-0.000158</td>\n",
              "      <td>-0.025454</td>\n",
              "      <td>0.076088</td>\n",
              "      <td>0.060053</td>\n",
              "      <td>0.087499</td>\n",
              "      <td>0.017404</td>\n",
              "      <td>0.070225</td>\n",
              "      <td>0.078440</td>\n",
              "      <td>0.497743</td>\n",
              "      <td>-0.276735</td>\n",
              "      <td>-0.005181</td>\n",
              "      <td>-0.254792</td>\n",
              "      <td>0.377312</td>\n",
              "      <td>0.174316</td>\n",
              "      <td>0.035229</td>\n",
              "      <td>0.207289</td>\n",
              "      <td>-0.018681</td>\n",
              "      <td>0.150939</td>\n",
              "      <td>0.057525</td>\n",
              "      <td>Class_8</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              "      id     pca_0     pca_1     pca_2  ...    pca_42    pca_43    pca_44   target\n",
              "0   7898  0.667875  0.057747  0.050709  ... -0.022299  0.026603 -0.027396  Class_2\n",
              "1  11288  0.429502 -0.023209 -0.247563  ...  0.023400 -0.083766  0.038580  Class_2\n",
              "2  10356  0.105192 -0.245815 -0.001409  ... -0.010258 -0.120268  0.005394  Class_2\n",
              "3  13439  0.573379  0.019872 -0.202900  ...  0.050134 -0.061041 -0.085273  Class_2\n",
              "4  54130 -0.163251 -0.099113 -0.070745  ... -0.018681  0.150939  0.057525  Class_8\n",
              "\n",
              "[5 rows x 47 columns]"
            ]
          },
          "metadata": {
            "tags": []
          },
          "execution_count": 27
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "_DDLqjOSXg-g",
        "colab_type": "text"
      },
      "source": [
        "## 9. 分别在不同的情况下，训练RBF核的SVM，然后对比效果\n",
        "+ 原始数据进行MinMaxScale，再训练RBF核的SVM\n",
        "+ 原始数据进行PCA降维，然后进行MinMaxScale, 再训练RBF核的SVM\n",
        "+ 原始数据进行tfidf处理后，然后进行MinMaxScale处理, 再训练RBF核的SVM\n",
        "+ 原始数据进行tfidf处理后，然后进行PCA降维，再进行MinMaxScale处理，接着训练RBF核的SVM\n",
        "\n",
        "首先，我们定义RBF核SVM的搜索流程类"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "GQMoGs_Y3Qqs",
        "colab_type": "code",
        "colab": {}
      },
      "source": [
        "import pandas as pd\n",
        "from sklearn.preprocessing import MinMaxScaler\n",
        "from sklearn.model_selection import train_test_split\n",
        "from sklearn.svm import SVC\n",
        "import numpy as np\n",
        "import _pickle as cPickle\n",
        "from scipy.sparse import csr_matrix\n",
        "\n",
        "\n",
        "class TrainValData:\n",
        "  \"\"\"\n",
        "  训练集和校验集\n",
        "  \"\"\"\n",
        "  def __init__(self, data_file):\n",
        "    data = pd.read_csv(data_file)\n",
        "\n",
        "    self.y_train_ = data[\"target\"]\n",
        "    self.X_train_ = data.drop([\"id\", \"target\"], axis=1)\n",
        "    self.X_train_ = MinMaxScaler().fit_transform(self.X_train_)\n",
        "    self.X_train_ = csr_matrix(self.X_train_)\n",
        "\n",
        "    self.X_train_part_, self.X_val_part_, self.y_train_part_, self.y_val_part_ = train_test_split(self.X_train_, self.y_train_, train_size=0.8, random_state=0)\n",
        "\n",
        "\n",
        "class SVMSearch:\n",
        "  \"\"\"\n",
        "  带RBF核的SVM超参数搜索\n",
        "  \"\"\"\n",
        "  def __init__(self, data):\n",
        "    self.data_ = data\n",
        "\n",
        "    self.C_list_ = np.logspace(-1, 3, 5).tolist()\n",
        "    self.gamma_list_ = np.logspace(-1, 1, 3).tolist()\n",
        "    self.accuracy_list_ = np.zeros(shape=(len(self.C_list_), len(self.gamma_list_)), dtype=float).tolist() \n",
        "\n",
        "    self.best_C_ = None\n",
        "    self.best_gamma_ = None\n",
        "    self.best_accuracy_ = None\n",
        "\n",
        "  def search(self):\n",
        "    \"\"\"\n",
        "    对C和gamma在指定范围内进行超参数搜索\n",
        "    \"\"\"\n",
        "    for i, each_C in enumerate(self.C_list_):\n",
        "      for j, each_gamma in enumerate(self.gamma_list_):\n",
        "        self.accuracy_list_[i][j] = self.single_try(each_C, each_gamma)\n",
        "\n",
        "  def single_try(self, C, gamma):\n",
        "    \"\"\"\n",
        "    尝试一次超参数组合, 返回在校验集上的accuracy\n",
        "    :param C: 正则项系数\n",
        "    :param gamma: rbf核函数宽度\n",
        "    :return: 当前C和gamma组合训练的模型，在校验集上的accuracy\n",
        "    \"\"\"\n",
        "    print(\"start trying: C={}, gamma={}\".format(C, gamma))\n",
        "\n",
        "    each_svc = SVC(C=C, kernel=\"rbf\", gamma=gamma)\n",
        "    each_svc.fit(self.data_.X_train_part_, self.data_.y_train_part_)\n",
        "    accuracy = each_svc.score(self.data_.X_val_part_, self.data_.y_val_part_)\n",
        "\n",
        "    print(\"accuracy={}\\n\".format(accuracy))\n",
        "\n",
        "    return accuracy\n",
        "\n",
        "  def draw(self):\n",
        "    accuracy_T = np.array(self.accuracy_list_).T\n",
        "    x_axis = np.log10(self.C_list_)\n",
        "    colors = [\"b-\", \"y-\", \"r-\"]\n",
        "    for i, each_gamma in enumerate(self.gamma_list_):\n",
        "        plt.plot(x_axis, accuracy_T[i], colors[i], label=\"gamma={}\".format(each_gamma))\n",
        "    plt.legend()\n",
        "    plt.xlabel(\"log10(C)\")\n",
        "    plt.ylabel(\"accuracy\")\n",
        "\n",
        "  def get_best_params(self):\n",
        "    \"\"\"\n",
        "      找到最佳超参数组合\n",
        "    \"\"\"\n",
        "    accuracy_array = np.array(self.accuracy_list_)\n",
        "    C_index, gamma_index = np.unravel_index(np.argmax(accuracy_array, axis=None), accuracy_array.shape)\n",
        "    self.best_C_, self.best_gamma_, self.best_accuracy_ = self.C_list_[C_index], self.gamma_list_[gamma_index], accuracy_array[C_index][gamma_index]\n",
        "    print(\"best_C_={}, best_gamma_={}, best_accuracy_={}\".format(self.best_C_, self.best_gamma_, self.best_accuracy_))\n",
        "\n",
        "  def save_best_model(self, model_path):\n",
        "    \"\"\"\n",
        "      用最佳超参数组合在整个训练集上构建模型, 并持久化\n",
        "    \"\"\"\n",
        "    print(\"start fitting with best params...\")\n",
        "    best_svc = SVC(C=self.best_C_, kernel=\"rbf\", gamma=self.best_gamma_, probability=True)\n",
        "    best_svc.fit(self.data_.X_train_, self.data_.y_train_)\n",
        "\n",
        "    print(\"end fitting with best params.\")\n",
        "\n",
        "    cPickle.dump(best_svc, open(model_path, \"wb\"))"
      ],
      "execution_count": 0,
      "outputs": []
    },
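    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The search class above can also be expressed with scikit-learn's built-in tools. The next cell is a minimal sketch, not the method used in this notebook: it reuses the same C and gamma grid, but places `MinMaxScaler` inside a `Pipeline` so that scaling is fitted on the training folds only, and it uses 3-fold cross-validation instead of a single 80/20 split. `X_demo` and `y_demo` are synthetic stand-ins generated with `make_classification`, and the fold count is an assumption chosen only for illustration."
      ]
    },
    {
      "cell_type": "code",
      "metadata": {},
      "source": [
        "# Hedged sketch: same C/gamma grid as SVMSearch, with scaling fitted inside each CV fold.\n",
        "import numpy as np\n",
        "from sklearn.datasets import make_classification\n",
        "from sklearn.model_selection import GridSearchCV\n",
        "from sklearn.pipeline import Pipeline\n",
        "from sklearn.preprocessing import MinMaxScaler\n",
        "from sklearn.svm import SVC\n",
        "\n",
        "# Synthetic stand-in data (assumption); the notebook itself reads the Otto CSV files.\n",
        "X_demo, y_demo = make_classification(n_samples=300, n_features=20, n_informative=10, n_classes=3, random_state=0)\n",
        "\n",
        "# Scaler and classifier are chained so the scaler never sees the held-out fold.\n",
        "pipeline = Pipeline([\n",
        "  (\"scale\", MinMaxScaler()),\n",
        "  (\"svc\", SVC(kernel=\"rbf\")),\n",
        "])\n",
        "\n",
        "# Same grids as SVMSearch: C in 10^-1..10^3, gamma in 10^-1..10^1.\n",
        "param_grid = {\n",
        "  \"svc__C\": np.logspace(-1, 3, 5).tolist(),\n",
        "  \"svc__gamma\": np.logspace(-1, 1, 3).tolist(),\n",
        "}\n",
        "\n",
        "grid = GridSearchCV(pipeline, param_grid, cv=3, scoring=\"accuracy\")\n",
        "grid.fit(X_demo, y_demo)\n",
        "print(grid.best_params_, grid.best_score_)"
      ],
      "execution_count": null,
      "outputs": []
    },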
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "LOXh1BOZnq2g",
        "colab_type": "text"
      },
      "source": [
        "### 9.1 原始数据进行MinMaxScale，再训练RBF核的SVM"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "H5BJXo5EntY2",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "outputId": "15f7cb90-6fc1-4968-f327-c1f86f5c4531"
      },
      "source": [
        "train_val_data_org = TrainValData(org_data_path) # 解析原始数据，并切割成训练集和校验集\n",
        "search_org = SVMSearch(train_val_data_org)    # 创建超参数搜索对象\n",
        "search_org.search()                 # 超参数C和gamma搜索\n",
        "search_org.draw()                  # 画出不同超参数组合下的accuracy曲线\n",
        "search_org.get_best_params()            # 找到最佳参数组合\n",
        "search_org.save_best_model(\"./model/Otto_org_rbf_svc.pkl\") # 保存模型\n"
      ],
      "execution_count": 29,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "start trying: C=0.1, gamma=0.1\n",
            "accuracy=0.5085\n",
            "\n",
            "start trying: C=0.1, gamma=1.0\n",
            "accuracy=0.689\n",
            "\n",
            "start trying: C=0.1, gamma=10.0\n",
            "accuracy=0.653\n",
            "\n",
            "start trying: C=1.0, gamma=0.1\n",
            "accuracy=0.6925\n",
            "\n",
            "start trying: C=1.0, gamma=1.0\n",
            "accuracy=0.747\n",
            "\n",
            "start trying: C=1.0, gamma=10.0\n",
            "accuracy=0.7525\n",
            "\n",
            "start trying: C=10.0, gamma=0.1\n",
            "accuracy=0.738\n",
            "\n",
            "start trying: C=10.0, gamma=1.0\n",
            "accuracy=0.765\n",
            "\n",
            "start trying: C=10.0, gamma=10.0\n",
            "accuracy=0.7605\n",
            "\n",
            "start trying: C=100.0, gamma=0.1\n",
            "accuracy=0.7625\n",
            "\n",
            "start trying: C=100.0, gamma=1.0\n",
            "accuracy=0.766\n",
            "\n",
            "start trying: C=100.0, gamma=10.0\n",
            "accuracy=0.735\n",
            "\n",
            "start trying: C=1000.0, gamma=0.1\n",
            "accuracy=0.772\n",
            "\n",
            "start trying: C=1000.0, gamma=1.0\n",
            "accuracy=0.7455\n",
            "\n",
            "start trying: C=1000.0, gamma=10.0\n",
            "accuracy=0.7185\n",
            "\n",
            "best_C_=1000.0, best_gamma_=0.1, best_accuracy_=0.772\n",
            "start fitting with best params...\n",
            "end fitting with best params.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEGCAYAAAB/+QKOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXhU5fXA8e9JCAlJkH0JCXtQWcKu\noAiCSkGroNZWbatoRfqz4t66tIgK1l1Ei1gRrbhbN0SrCCpoXbAEEZF9kSURBBKWLCRkeX9/vHfI\nJEzCJMydm0zO53nmySx3Zk4Gcs+823nFGINSSilVUZTXASillKqdNEEopZQKSBOEUkqpgDRBKKWU\nCkgThFJKqYAaeB1AqLRs2dJ06tTJ6zCUUqpOWbZs2R5jTKtAj0VMgujUqRPp6eleh6GUUnWKiGyt\n7DHtYlJKKRWQJgillFIBaYJQSikVkCYIpZRSAWmCUEopFZAmCKWUUgFpglBKKRVQxKyDUEqpSFdS\nAnv2wM6d5S9Nm8If/xj699MEoZRSHjIGDhw48qQf6LJrF5SWHvkagwdrglBKqTqjoAB+/jm4E39B\nwZHPb9AA2rSBtm0hORkGDLDXA10SE935HTRBKKVUkCrr4gl02bcv8Gu0bFl2Yj/ttCNP9r6k0Lw5\nRHk8SqwJQilVqxhjEJEwvt+xd/EkJpad4Hv1grPOCvxNv3VriIkJ2692zDRBKOWikpJ88vJWU1Ky\nH2NKMKYUKKlwvRRjSoDyP6v/eChfq+LjoXytqj8DMMTEtKZx4wE0bjzw8CU2tl21PvtQdPH4Tuwp\nKTBwYOCTfps27nXxeE0ThFIhYEwpBw9uJi/ve/LyVpKba38ePLgRMGGKQoAoRKIRiQKiK1y3j/mO\nOfLYyh4vf12kYYheq7JjhcLCbeTkpJOd/RE2aUDDhm1JTBxIVNRADh4cSFbWAHbubMvOnYETQU27\neHyXZs287+LxmiYIparp0KE95OWtJC/ve3JzVzpJYRWlpfnOEUKjRqkkJKTRps3vSEhIIyamVTVO\n1DU9qYevW8Yte/fCV1/Bhg32JL9nTx7GrCAhIZ2WLdPp2DGd9u3/Q1SUIS4OEhKSKS4eSE7OQAoL\nBxAbO5C0tFaMHHlkn35d7OLxmiYIpSpRUlJAfv6aci2CvLzvOXRo5+FjYmJakpDQm6Skq0lM7E1C\nQhoJCT2Ijk7wMPK6Y+dO+O9/4fPP7WXlSjsmAPZE3rZtAm3bnkrbtqdSUADFxXDwYA7t2n1Hs2a+\nhLGMgoJ3D79mbGwHp1tqwOGfMTEtPPoN6zYxJlzNX3cNHDjQ6IZBqiaMMRQUbD2ieyg/fz1QAoBI\nLAkJPUhI6E1iYhoJjXqRUNSBhgeikKwsO7Vl9277M9ClQwcYPtxeTj4ZYmO9/JU9YQxs3WoTgS8p\nrF9vH4uPh1NPhWHDYOhQSEuzs3iCbRQVF+8nJ2c5OTnp5OYuIycn3enes+LiOpcb00hM7E9MTDMX\nfsu6R0SWGWMGBnxME4SqT4qK9lXoHlpJXu73mLxcYvZDw/0Qn9+GxIJ2xOe3JC43kYb7G9BgX3FZ\nItizB7Ky7JzHQGJjoVUr29ndsqU9061fDytW2LNko0b2bDh8OIwYASedBA0bhvVzCAdjYN26stbB\n55/D9u32saZNbf//sGH20r9/6Lt+ior2kpv7LTk5NmHk5KRTUPDj4cfj4rr6DYIPoHHj/jRo0CS0\nQdQBmiBU/XLoEKW7d1CQkU5BxrcU7VhN8Y4NlO7ajmQdIGY/xByAhvujaXggmgb7SogqrORkHxVV\ndqL3v/gngIr3xccH/uqbnW3PkosX28uKFfb+Ro1gyBCbLIYPtwmjDnaUl5TA99+XJYP//tc2qsCO\nA/iSwbBhdiqoFwPARUVZ5OR8
ezhh5OSkU1i47fDjjRodX657KjGxHw0aNA5/oGGkCULVXaWldjpK\nJV04ZvduSndlULorA/bsISr7ANE5RZW/3HFxmBbNkFZJSOt2yNFO+k2buncmy8oqSxiLFtkOeLAJ\n5rTTyloYAwbUyoRx6BCkp5clgy++sOsJADp1Kp8QUlOD7y4Kt0OHdpdrZeTmLqOwMMN5VIiPP7FC\n91TfiBpj0gShagdjIC/vyP75QH33vvuysgKvTAJKY6M41MRQ1MRQ1ASKmkBp80SkVRLRbToTk3Qi\nsSl9iE3uR1TrJNvVU5u7cvbssWfbRYts0vjhB3t/QoJNGL4WxoABdpJ+mOXlwZIlZeMHS5bAwYP2\nse7dy5LB0KHQvn3YwwupwsKdh8cybPJY6jc5IYr4+O7l1mgkJvYhOrqRpzHXlCYI5Y5Dh+wJ/Ggn\nef9LoBVJANHRh7+1m5YtKGkaR1GTUgqOK+Bg/D7y4n8mv9FumwiaQmnzBBq16F02aJzQm4SEXpE1\n8Lh7N3z2WVmX1KpV9v7ERHsW9rUw+vVzJWHs2wdfflnWZZSebmcRRUVB3742hGHDbO5q3Trkb1/r\nFBb+VK6lkZOTTlHRLufRaBISepbrnkpI6E10dJynMQfDswQhIqOBx4FoYLYx5oEKjz8GjHBuxgOt\njTFNncdKAKfNzTZjzJiq3ksTRJjs2AEXX2z7z339CYE0bXrUvnrTogVFTSCv0U5yozeRm7/KmUm0\nGmMKnReKIj7+BBIS0vymkaYRF9fRWQtQj+zaZROGr4WxZo29v3Fje7b2tTD69q1Rwvj5Z9s68LUQ\nfGPqMTF2WMTXQjj1VGhS/8Zyj2CMobAw44ikUVycBYBIAxIS0iokjTSiompXK9aTBCF29c56YCSQ\nASwFLjXGrK7k+OuAfsaYPzi3c40xQS9g1wQRBhkZcMYZNklceaX92hioD7958yP6zG3JiVVHrCko\nKtpz+JiGDds6LYG0w62C+PjudeJbmCd+/rmsdbF4Maxda+8/7jh7JvdNq+3b17bQKti6tfwahHXr\n7P2+SVa+hHDyyXZYRB2dTRrb/BLGMidp7AVApCGJib1JTCwb00hI6ElUlHdjTF4liFOAu40xo5zb\ndwAYY+6v5PivgLuMMQud25ogapNt2+w31N27Yf58ewYJIJiSE1FRjUhI6HVEq6Bhw1Zh/IUi0I4d\nZV1SixaVLTJo0gQzbBi7ewzni5gRzN3cm8++iGbbtsMPH+4uGjrUTjmtzUM1dY1dZ/NjuYSRk7OM\nkpL9gF1jk5jYp9yU2/j4HkRFhWecyasEcREw2hgz3rl9GTDIGDMxwLEdgSVAirEVuxCRYuA7oBh4\nwBgzN8DzJgATADp06DBg69atrvwu9d6WLTY57N0LH30EgwYBwZac6OrXKrA/GzXq4pSHUG4pKYE1\nn/xExsufEf35IrpsX0zXkg0A7JOmbGg7jIODR9Dm4uGkXtib6Jh61l3nMftFatMRs6dKSnIB+yUq\nMbFvue6p+PgTXfm7qQsJ4jZscrjO775kY0ymiHQBPgXONMZsquz9tAXhkk2bbHLIzYUFC8jr3ogt\nW+5m//4vOXRox+HDGjRo4SQA/0FjLTkRLocOwbJlZV1GX3wB++0XVDp2tK2D0WmZjJDFtF27GPls\nMWx0Vho3awann1426O3VIoV6ziaNDeXGM3JyllNamgdAVFQ8jRv3L9c9FR9//DGPxVWVINxsw2QC\n/pPdUpz7ArkEuNb/DmNMpvNzs4gsBvoBlSYI5YL16+2YQ0EBRfPf4sfjnuenpf+kQYPGtGgxplyr\noGHDthFRLK6uyM+Hb74pGz/4+uuyKacnnmjnEfi6jDp08D0rGfidc8Eua/Yf9J7rNNKbN7cJwzfo\n3bOnJowwELETMuLjT6BNG/tvZEwJ+fnryo1p7Ngxi8zMxwGIjk4kMbE/zZqdSadOk0Mfk4stiA
bY\nQeozsYlhKfBbY8yqCsedCMwHOhsnGBFpBuQbYwpFpCXwNTC2sgFu0BZEyK1dC2ecgSkqYtcrf2BD\no1kUFx+gXbv/o1One2jYsKXXEdYr+/cfOeW0qMguPuvbtywZDB16DFNOt20rP+j9o1OWomXLshaG\nL2HolwHPlJYWk5+/plz3VMOGbUlLO6IXPiheTnM9B5iOneb6nDHm7yIyBUg3xsxzjrkbiDPG3O73\nvFOBp7GF4KOA6caYZ6t6L00QIbRqFeaMMzAc4ofpTclO2kKzZiPp2nUaiYm9vI6uXti1y3YT+RLC\nihV2vWBMjN24xn/KadOmLgWxdWtZsli0yN4GO1vNv0uqe3dNGB47ll34dKGcCt7332POGE5xVD7L\nHynEnNCNrl2n0aLFL7ULyUXbt5cvauebsdqoEZxySllCGDTIwymnW7aUJYtFi8oq77VuXda6GD7c\n9nHp/5U6QxOECkrR0sVE/eJsihsU8P3jibQdOoXk5Gtr3cKeus4YuyGOf1G7LVvsY02alFU5HTrU\nVtWolVNOjbFB+8YvFi2y62TAVubzJYsRI+D44zVh1GKaIFSVSkuL2D3/rzS/5FFK4g07XvodycOm\n6zhDiJSW2jp8/vsg/Pyzfax167I1CMOG2X0QAqxpq/2Mgc2by7cwfvrJPta2bVmyGD4cunXThFGL\naIJQlcrK+oAdc6/hxBu2UdIkjuIFc0noOcrrsCLC3r3w9NPwj3+UnSs7dChf5TRiv1wbY6dI+7cw\ndjjTotu1K98lVZtLvdYDmiDUEfLyVrNp0y0Ufz6fPrdFQevWRC1egnTs6HVodd6mTTB9Ojz3nJ2O\nOnIkXHaZTQj19uP19av5D3rvdKqjJieX75Lq0kUTRhhpglCHFRVlsWXLPWRmzqTZD3Gk3V6MtGuP\nLFps/1BVjX31FTz6KLzzjq2V97vfwU03Qe/eXkdWCxlj19n4ksXixWX9bikpdjS+Rw976d4dTjgB\n4rQmlxs0QShKS4v46aen2LLlboqL99N127mkXPMx0qEDfPopJCV5HWKdVFxsE8K0aXZ/hGbN4Jpr\nYOJE/Uirxbc/qS9ZfPedXent2wskKsq2LLp3L0saPXrYGVONI3vHN7dpgqjnsrI+ZNOmm8nPX0uz\nZmdx/Lbf0OjiG+wf3Cef2FknqlpycmwX0uOP2/VkXbva1sIVV9j9fVQIFBbaVsaaNbB6tb2sWWMT\nSZHfroHt25dvbfh+Nm/uXex1iFelNpTH8vLWsGnTLWRnf0ijRt3o1WseLZY2QH59gW2yf/yxXfSk\ngpaRYQedn37arm4+7TTbrTRmTB2dfVSbxcbaaV1paeXvLy62Az2+xOH7+c9/ltUbAfvFxz9p+K63\naaNjHEHSBBGBioqy2bLlbjIzZxIdnUjXro+SnDyRqA8WwK/Ot6USFi6EFi28DrXOWL7cJoLXX7e9\nHhddBLfcYvdKUGHWoIH9gnPCCXD++WX3l5baciH+rY3Vq+Gll8pvbtWsWeDE0b69Jo4KtIspgthx\nhn+yZctdFBfvp127CXTqNMXuszB3LvzmN9CnDyxYYP9IVJVKS+HDD21iWLTI7vQ5fjzccAN06uR1\ndCpoxtgpthUTx+rVdhtcn8REmygqJo7OnSO6eahjEPVAVtZ8Z5xhDU2bnklq6mMkJjpN8zffhEsv\ntcty5893sXhPZDh40H7pnDbNlrxISbFJYfx4/egizu7dNmFUHOfI9Cs8HRtrWysVxzlSU2vpMvfq\n0TGICJaXt5ZNm252xhlS6dXrXVq0OK+sbtLrr9v5loMHwwcf2O0oVUC7dsHMmfaye7fdWe3ll+HX\nvz5iB1UVKVq1spdhw8rfv3//kYnjm2/gtdfKjmnQwCaJionjhBNsEa0IoAmijrLjDPeQmfkk0dEJ\ndO36CMnJ15Wvm/TSSzBunB1J/c9/bBNaHWHtWttaeOEFO3
Hm3HPt+MLpp2uXdL3VpIn9UjV4cPn7\n8/PtLCr/1saqVfDuu3YbP7D/aTp3PnKc48QT69wXNE0QdYwdZ3jaGWfYR1LS1XTuPPXI/Zyffx7+\n8Ae7MnXePJ17WYExdrr9o4/a3BkXZ6eo3nij/TtWKqD4eOjXz178FRbadRsVxzkWLLDb/fmkpAQe\nIK+lE0Y0QdQh2dkfsXHjTc44wxnOOEOAZbqzZ8OECXDWWXZw2rP60LVPUZHtdZs2zc5MatUK7rnH\nLm7TGb+qxmJj7ezAnj3L319cbBfKVEwczzxjWyM+rVsfuY6jRw9b6NDDZqwmiDrAjjPcQnb2B8TF\ndaVXr7m0aDEm8P4MTz0Ff/oTnH02vP22lidw7NsHs2bBE0/Y8cfu3e3f6O9/rx+RclGDBrZ6bbdu\nMHZs2f2lpXY/jYqzql55pWwzcbCzIiq2Nnr0sFNyw7ANrM5iqsXsOMMUfvrpSaKi4unUabJdzxAV\nG/gJ//gHXH89nHcevPGG/VZTz/34o13t/OyzkJsLZ55pxxdGjdJtllUtZIwtYlgxcaxZY2dR+CQk\n2L5QX9Lo39/+p64BneZax5SWFrNjx9P8+ONkv3GGKTRsWMVmw9Om2TPfBRfYmRYRMP3uWCxZYscX\n3n7bJoJLL4Wbb7b7NytVJ+3ZU35mle9nRobde/bLL2v0sjrNtQ6x4ww3k5+/mqZNR5CaOj3wOIO/\nBx+E22+38zFffrnezsksKbGTSR591FZWbdoU/vIXuO46LVSrIkDLlnZ3qaFDy99/4ABkZ7vylpog\naon8/HVs3HgL2dn/Ofo4g79774U777RfkV94wfZ51jO5uXbS1vTptkRP5852rOHKK3Vmr6oHjjvO\ntemz9e9sUssUFe1l69YpZGbOICoqni5dHiYl5brKxxl8jLHTb+65x+5G869/RXQ5gEB++qmscN7e\nvXDKKbYxdf759e6jUMoVmiA8cuQ4w3hnPUMV4ww+xsCkSXDfffZr8jPP1Ksz4ooVdsjl1Vdtt9IF\nF9jhl1NO8ToypSKLJggPZGcvcNYz+MYZHiMxsU9wTzYGbrsNHn7YrnV46ql6MR3HGFtGato0W6U8\nIcGuXbjB2dZCKRV6miDCKD9/HZs2/ZmsrPeJi+tKz57v0LLl2KOPM/gYY6fiTJ9u1zr84x8RnxwK\nCuy4+7RpdsJGu3bwwAM2N2pBWqXcpQkiDMqPMzSiS5eHSEm5/ujjDP6MsWscZsywX5sfeyyiCwXt\n2WMbRzNm2OnfffrYMfiLL673M3iVChtNEC6y4wyznHGGbL/1DNXc4rO01LYYnn4a/vxneOihiE0O\n69bZ3Ddnjm09nHOOHV8YMSJif2Wlai1NEC7Jzl7ojDOsomnT4c56hiDHGfyVltr+lGefhTvugL//\nPeLOlMbA55/b9QvvvWcXgF92md3juUcPr6NTqv7SBBFi5ccZutCz59u0bHl+8OMM/kpK4Kqr7Nfp\nyZPh7rsjKjkUFdm9jB59FJYts+uA7rrLNpZaBzGZSynlLlcThIiMBh4HooHZxpgHKjz+GDDCuRkP\ntDbGNHUeGwdMch671xgzx81Yj5UdZ5hKZuY/aj7O4K+42NaffvllmDLFLoaLEPv325m5Tzxh65Wd\ncILtPbvssojZZ0WpiOBaghCRaOBJYCSQASwVkXnGmNW+Y4wxN/kdfx3Qz7neHLgLGAgYYJnz3L1u\nxVtTdpzhGX788U5nnMG3nqGa4wz+iors2fL11+H++20ZjQiwdastnDd7NuTk2HGFmTPtOEOET8ZS\nqk5yswVxMrDRGLMZQEReA8YCqys5/lJsUgAYBSw0xmQ7z10IjAZedTHeaqs4ztC162M0bnyM1eAO\nHYLf/hbeegseecSO0NZx//uf7UZ66y3bQ3bxxXa2bv/+XkemlKqKmwkiGdjudzsDGBToQBHpCHQG\nPq3iuUeUWxORCcAEgA
4dOhx7xEHKz1/vjDO8d+zjDP4KC+3Z89137VqHG24ITcAeKCmxA86PPgpf\nfGF3cLz5Zls4r317r6NTSgWjtgxSXwK8aYwpqc6TjDGzgFlgy327EZi/oqJ9fuMMcXTp8iApKTfU\nfJzBX0EBXHSR3f9yxgy49tpjf00P5OXZMfXHHrM7MHbsaK9fdRU0bux1dEqp6nAzQWQC/t8VU5z7\nArkE8D8jZgLDKzx3cQhjqxbfOMOWLZMpKsoiKekqOne+99jGGfwdPGgLCn30kR2tnTAhNK8bRjt2\n2Lz2z3/aysODBtlSURdcUC8LzCoVEdz8010KdBORztgT/iXAbyseJCInAs2Ar/3u/gi4T0R8xRR+\nAdzhYqyVys7+mE2bbiIv7weaNDmd1NTpxz7O4C8/325F+Mkndq3DH/4QutcOg5UrbRmMV16xY+vn\nn2+HTU49NaJm5CpVL7mWIIwxxSIyEXuyjwaeM8asEpEpQLoxZp5z6CXAa8ZvaztjTLaITMUmGYAp\nvgHrcMnP38CmTbc44wyd6dnzLVq2vODYxxn85eba7UE//9xuaHD55aF7bRcZAwsX2vGFBQsgPt42\nem64AVJTvY5OKRUquuVoBRXHGTp2/BvJyTcQHR3ine1zcuz8zq++ghdftDOX6oDCQjjjDBt2UpId\ndP7jH6F5c68jU0rVhG45GgQ7zjCbLVvupKgoi7Zt/0DnzvcSG9s29G+2fz+cfbad//naa3ar0Dpi\n5kybHB5/3CaG2BCMzyulaidNEMDevZ+wceONzjjDMGecoZ87b7ZvH4waBd9+C//+N1x4oTvv44K9\ne2HqVPjFL2xhWaVUZKv3CSI/fz0rVpzljDO8ScuWF4Z2nMFfdjaMHGlHdt96C8aMced9XPL3v9v8\n9vDDXkeilAqHep8g4uOPp1eveTRrNjL04wz+9uyBs86CtWth7lw7/lCH/Pij3Z9o3Djo3dvraJRS\n4VDvEwRAy5bnufsGu3bZ5LBhA8ybZ/to6pi//c1uez11qteRKKXCRUukuW3nTluVbuNGeP/9Opkc\n0tPh1VdtqYyUFK+jUUqFi7Yg3PTTT3ZOaEYGfPghnH661xFVmzF2E7tWreDWW72ORikVTpog3LJ9\nu00OO3fC/Plw2mleR1Qj778Pn30GTz4Jxx3ndTRKqXDSBOGGrVttt1JWll1yPHiw1xHVSHGxbTUc\nfzxcfbXX0Silwk0TRKht3mxbDvv3w8cfw0kneR1RjT37rJ109c47EBPjdTRKqXDTBBFKGzfalkN+\nvi2+V4d3xMnJsftDn3aarSWolKp/NEGEyrp1tuVw6BB8+in06eN1RMfk4Yfh55/t3kValVWp+kkT\nRCisXg1nngmlpbBoEfTq5XVEx+Snn2yl1t/8xu7roJSqn3QdxLH64QcYPtxeX7y4zicHgMmT7d4O\n99/vdSRKKS9pgjgWK1bY5BATY5ND9+5eR3TMfvgB/vUvmDgRunTxOhqllJc0QdTUt9/aMYf4eLtQ\n4IQTvI4oJG691a53mDTJ60iUUl7TMYiaWLrUlsxo0sSOOXTu7HVEIfHJJ3bB98MP6wZASiltQVTf\nkiW28F7z5rblECHJobQU/vIX6NjRdi8ppZS2IKrjyy/tTnBt2tiWQwRVrnv5ZVi+3P6Mc7HquVKq\n7giqBSEib4vIL0Wk/rY4PvvM7gTXrp29HkHJ4eBBW857wAC45BKvo1FK1RbBnvBnAr8FNojIAyIS\nGSOywfr0U9ty6NjRzlZq187riELq8cdtbcFHHoGo+vsVQClVQVCnA2PMx8aY3wH9gS3AxyLylYhc\nKSKRXaVnwQL45S8hNdV2K7Vt63VEIbV7t13vcO65Zcs5lFIKqjFILSItgCuA8cBy4HFswljoSmS1\nwQcf2H2jTzzRtiJat/Y6opCbOhXy8uChh7yORClV2wQ1SC0i7wAnAC8C5xljdjgPvS4i
6W4F56n3\n3oOLLoK0NNuKiMB5nxs2wFNPwfjxEbHGTykVYsHOYnrCGLMo0APGmIEhjKd2eOcdW4iof3/46CNo\n2tTriFxxxx0QGwt33+11JEqp2ijYLqYeInL4LCkizUTkTy7F5K033oBf/9ru47BgQcQmh6++grfe\nsiunI2xYRSkVIsEmiKuNMft8N4wxe4HI22Ps1Vfh0kvh1FNty6FJE68jcoVvn+mkJLjlFq+jUUrV\nVsF2MUWLiBhjDICIRAMN3QvLAy++CFdcAcOG2Y2YExK8jsg1b70FX38NzzwT0b+mUuoYBZsg5mMH\npJ92bv/RuS8yPPecHak980y7Q058vNcRuebQIbj9dujZE6680utolFK1WbBdTLcBi4BrnMsnwK1H\ne5KIjBaRdSKyUURur+SY34jIahFZJSKv+N1fIiLfOZd5QcZZfWvX2uQwahTMmxfRyQHgn/+ETZvs\ntNboaK+jUUrVZuL0GoX+hW031HpgJJABLAUuNcas9jumG/Bv4AxjzF4RaW2M2eU8lmuMSQz2/QYO\nHGjS02s44/a992DkyIgvQrRvn13v16cPfPyxbiWqlAIRWVbZbNRg10F0A+4HegCHz6LGmKq2lDkZ\n2GiM2ey8xmvAWGC13zFXA086g974kkPYnXeeJ28bbg88ANnZtqSGJgel1NEE28X0L+ApoBgYAbwA\nvHSU5yQD2/1uZzj3+TseOF5EvhSRJSIy2u+xOBFJd+4/P9AbiMgE55j03bt3B/mr1E/btsH06fD7\n30O/fl5Ho5SqC4JNEI2MMZ9gu6S2GmPuBn4ZgvdvAHQDhgOXAs/4rbfo6DR7fgtMF5GuFZ9sjJll\njBlojBnYqlWrEIQTuXw7xN17r7dxKKXqjmATRKFT6nuDiEwUkQuAo40PZALt/W6nOPf5ywDmGWOK\njDE/YscsugEYYzKdn5uBxYB+762h5cvhpZfgxhuhQwevo1FK1RXBJogbgHjgemAA8Htg3FGesxTo\nJiKdRaQhcAlQcTbSXGzrARFpie1y2uys1I71u38I5ccuVJB8i+KaN7elNZRSKlhHHaR2ZiNdbIz5\nM5ALBDV73hhTLCITgY+AaOA5Y8wqEZkCpBtj5jmP/UJEVgMlwF+MMVkicirwtIiUYpPYA/6zn1Tw\nPvzQFqJ9/PGIXRiulHJJUNNcRWSJMWZwGOKpsWOa5hqhiouhb18oLIRVq6BhZK19V0qFwDFPcwWW\nO4vV3gDyfHcaY94OQXzKJc8/bxPDG29oclBKVV+wCSIOyALO8LvPAJogaqm8PJg8GU45BX71K6+j\nUUrVRUElCGOMVu2pYx59FHbsgDff1EVxSqmaCXYl9b+wLYZyjDF/CHlE6pjt3GlrLf3qV7ZyuVJK\n1USwXUzv+12PAy4Afgp9OKB4R14AABlXSURBVCoU7r7bDkzff7/XkSil6rJgu5je8r8tIq8CX7gS\nkTomq1fD7Nnwpz9Bt25eR6OUqsuCXShXUTegdSgDUaFx2212E6DJk72ORClV1wU7BpFD+TGIndg9\nIlQtsnix3Qzv/vuhZUuvo1FK1XXBdjE1djsQdWxKS21Jjfbt4YYbvI5GKRUJgupiEpELRKSJ3+2m\nlZXgVt547TVYtsxWa23UyOtolFKRINgxiLuMMft9N4wx+4C73AlJVVdBAfz1r7asxu9/73U0SqlI\nEew010CJJNjnKpfNmAFbt8Kzz0JUTacdKKVUBcGeTtJFZJqIdHUu04BlbgamgpOdDX//O5x9Npx5\nptfRKKUiSbAJ4jrgEPA68BpQAFzrVlAqePfeCwcO2JXTSikVSsHOYsoDbnc5FlVNmzfb7qUrr4Re\nvbyORikVaYKdxbTQb69onB3fPnIvLBWMO+6AmBiYMsXrSJRSkSjYLqaWzswlAIwxe9GV1J765hv4\n97/hllugXTuvo1FKRaJgE0SpiBze7l5EOhGguqsK
D98+061bw1/+4nU0SqlIFexU1b8BX4jIZ4AA\nQ4EJrkWlqvTuu/DFF/DUU9BY17grpVwS1J7UACLSGpsUlgONgF3GmM9djK1a6sue1EVFdkA6KgpW\nroQGuhpFKXUMjnlPahEZD9wApADfAYOBrym/BakKg2eegfXrYd48TQ5KKXcFOwZxA3ASsNUYMwLo\nB+yr+ikq1A4csJsBnX46nHuu19EopSJdsAmiwBhTACAiscaYtcAJ7oWlAnnoIdi9Gx55RPeZVkq5\nL9hOigxnHcRcYKGI7AW2uheWqigjAx59FC69FAYG7C1USqnQCnYl9QXO1btFZBHQBJjvWlTqCHfe\nafd8uO8+ryNRStUX1R7mNMZ85kYgqnIrVsCcOXDzzdCpk9fRKKXqCy0OXQfceis0bQp/+5vXkSil\n6hOdKFnLLVhgL48+Cs2aeR2NUqo+cbUFISKjRWSdiGwUkYDVYEXkNyKyWkRWicgrfvePE5ENzmWc\nm3HWViUltpRG585wrRZXV0qFmWstCBGJBp4ERgIZwFIRmWeMWe13TDfgDmCIMWavs1obEWmO3dJ0\nILbm0zLnuXvdirc2evFF+P57u990bKzX0Sil6hs3WxAnAxuNMZuNMYewGw2NrXDM1cCTvhO/MWaX\nc/8oYKExJtt5bCEw2sVYa538fJg0CU4+GX7zG6+jUUrVR24miGRgu9/tDOc+f8cDx4vIlyKyRERG\nV+O5iMgEEUkXkfTdu3eHMHTvPfYYZGbqojillHe8nsXUAOgGDAcuBZ7x35joaIwxs4wxA40xA1u1\nauVSiOG3axc8+CCMHQtDh3odjVKqvnIzQWQC7f1upzj3+csA5hljiowxPwLrsQkjmOdGrHvusV1M\nDz7odSRKqfrMzQSxFOgmIp1FpCFwCTCvwjFzsa0HRKQltstpM/AR8Atna9NmwC+c+yLeunXw9NMw\nYQKcoNWulFIecm0WkzGmWEQmYk/s0cBzxphVIjIFSDfGzKMsEawGSoC/GGOyAERkKjbJAEwxxmS7\nFWttcvvt0KgR3HWX15Eopeq7oDcMqu0iYcOg//4Xhg2De+/VVdNKqfCoasMgrweplcMYuyguORlu\nusnraJRSSktt1BpvvAHffAPPPQfx8V5Ho5RS2oKoFQoL4Y47IC0NLr/c62iUUsrSFkQtMHMmbN4M\n8+dDdLTX0SillKUtCI/t3QtTp8LIkTBqlNfRKKVUGU0QHrvvPti3Dx5+2OtIlFKqPE0QHtqyBZ54\nwo479OnjdTRKKVWeJggP/e1vEBVl1z0opVRtownCI+np8Mordp/plBSvo1FKqSNpgvCAb1Fcq1Zw\n221eR6OUUoHpNFcP/Oc/sHgxzJgBxx3ndTRKKRWYtiDCrLjYth6OP95WbFVKqdpKWxBh9uyzsHYt\nvP02xMR4HY1SSlVOWxBhlJNjy3gPGQLnn+91NEopVTVtQYTRI4/Azz/D3Lm6z7RSqvbTFkSY/PST\nTRC//jUMHux1NEopdXSaIMLkrrugqAjuv9/rSJRSKjiaIMJg1Sq7z8O110LXrl5Ho5RSwdEEEQa3\n3gqNG8OkSV5HopRSwdNBapd98gl88AE89BC0aOF1NEopFTxNEC4qLbWL4jp2hOuu8zoapdxXVFRE\nRkYGBQUFXoeiKoiLiyMlJYWYaizA0gThopdfhuXL4aWXIC7O62iUcl9GRgaNGzemU6dOiM7lrjWM\nMWRlZZGRkUHnzp2Dfp6OQbjk4EFbzrt/f7j0Uq+jUSo8CgoKaNGihSaHWkZEaNGiRbVbdtqCcMkT\nT8D27TBnjt3zQan6QpND7VSTfxc9dblgzx67legvfwkjRngdjVJK1YwmCBdMnQq5uXbmklJKHc2P\nP/7IoEGDSE1N5eKLL+bQoUNHHJOVlcWIESNITExk4sSJYYlLE0SIbdwIM2fC+PHQo4fX0Sil6oLb\nbruNm266iY0b
N9KsWTOeffbZI46Ji4tj6tSpPPLII2GLS8cgQuyOOyA2Fu65x+tIlPLWjTfCd9+F\n9jX79oXp06s+ZurUqbz00ku0atWK9u3bM2DAAJo0acKsWbM4dOgQqampvPjii8THx3PFFVfQqFEj\nli9fzq5du3juued44YUX+Prrrxk0aBDPP/88AImJiVxzzTV88MEHJCUlcd9993Hrrbeybds2pk+f\nzpgxY9iyZQuXXXYZeXl5AMyYMYNTTz31qL+TMYZPP/2UV155BYBx48Zx9913c80115Q7LiEhgdNO\nO42NGzdW/4OrIW1BhNBXX8Gbb9q1D23beh2NUvXP0qVLeeutt1ixYgUffvgh6enpAFx44YUsXbqU\nFStW0L1793Lf0Pfu3cvXX3/NY489xpgxY7jppptYtWoVK1eu5Dsnw+Xl5XHGGWewatUqGjduzKRJ\nk1i4cCHvvPMOkydPBqB169YsXLiQb7/9ltdff53rr78egJycHPr27Rvwsnr1arKysmjatCkNGtjv\n6ykpKWRmZobzY6uUqy0IERkNPA5EA7ONMQ9UePwK4GHA92nMMMbMdh4rAVY6928zxoxxM9ZjZQz8\n+c82Mdxyi9fRKOW9o33Td8OXX37J2LFjiYuLIy4ujvPOOw+AH374gUmTJrFv3z5yc3MZNWrU4eec\nd955iAhpaWm0adOGtLQ0AHr27MmWLVvo27cvDRs2ZPTo0QCkpaURGxtLTEwMaWlpbNmyBbCLBCdO\nnMh3331HdHQ069evB6Bx48aHE00ge/bsceOjCAnXEoSIRANPAiOBDGCpiMwzxqyucOjrxphAIy4H\njTF93Yov1N5+G77+GmbNgsREr6NRSvm74oormDt3Ln369OH5559n8eLFhx+LjY0FICoq6vB13+3i\n4mIAYmJiDk8T9T/O/5jHHnuMNm3asGLFCkpLS4lzVsfm5OQwdOjQgHG98sordO/enX379lFcXEyD\nBg3IyMggOTk5tB9ADbnZxXQysNEYs9kYcwh4DRjr4vt55tAhuP12Oyh95ZVeR6NU/TVkyBDee+89\nCgoKyM3N5f333wfsSTopKYmioiJefvllV957//79JCUlERUVxYsvvkhJSQlQ1oIIdOnRowciwogR\nI3jzzTcBmDNnDmPH1o5TpZsJIhnY7nc7w7mvol+JyPci8qaItPe7P05E0kVkiYgE3KBTRCY4x6Tv\n3r07hKFXz9NP29lLDz0EDXTYXynPnHTSSYwZM4bevXtz9tlnk5aWRpMmTZg6dSqDBg1iyJAhnHji\nia6895/+9CfmzJlDnz59WLt2LQkJCUE/98EHH2TatGmkpqaSlZXFVVddBcC8efMOj3EAdOrUiZtv\nvpnnn3+elJQUVq+u2CETWmKMceeFRS4CRhtjxju3LwMG+XcniUgLINcYUygifwQuNsac4TyWbIzJ\nFJEuwKfAmcaYTZW938CBA41vQCqc9u+3ezz07m0rt+oiUlWfrVmzhu7du3saQ25uLomJieTn5zNs\n2DBmzZpF//79PY2ptgj07yMiy4wxAwMd7+b33UzAv0WQQtlgNADGmCy/m7OBh/wey3R+bhaRxUA/\noNIE4ZUHHoCsLLudqCYHpbw3YcIEVq9eTUFBAePGjdPkcAzcTBBLgW4i0hmbGC4Bfut/gIgkGWN2\nODfHAGuc+5sB+U7LoiUwBL/kUVts22Znavz+97Yon1LKe771BOrYuZYgjDHFIjIR+Ag7zfU5Y8wq\nEZkCpBtj5gHXi8gYoBjIBq5wnt4deFpESrHjJA8EmP3kuUmT7PTWe+/1OhKllAo9V4dUjTEfAB9U\nuG+y3/U7gDsCPO8rIM3N2I6Vb58H34ZASikVaXQldQ0YYxND8+a2tIZSSkUinZRZA/Pn2xlL06dD\n06ZeR6OUUu7QFkQ1lZTArbfaqa0VamkppVSNzJgxg9TUVESkytIbc+bMoVu3bn
Tr1o05c+a4Hpe2\nIKrp+efhhx/g3/+Ghg29jkYpFQmGDBnCueeey/Dhwys9Jjs7m3vuuYf09HREhAEDBjBmzBiaNWvm\nWlyaIKohLw/uvBMGD4aLLvI6GqVqtw0bbiQ3N7T1vhMT+9KtW9VVAOtauW+Afv36HfWYjz76iJEj\nR9K8eXMARo4cyfz587nUxU3vtYupGqZNgx07dFGcUrVVXSz3HazMzEzaty9bexyOsuDaggjSzp3w\n4INw4YUwZIjX0ShV+x3tm74b6mK579pME0SQ7r4bCgttaQ2lVN1Sm8t99whyb+Lk5ORycWdkZFQ5\nZhEK2sUUhDVrYPZs+L//g27dvI5GKVWZuljuO1ijRo1iwYIF7N27l71797JgwYJyLSE3aIIIwm23\nQUIC+FXdVUrVQnW13PcTTzxBSkoKGRkZ9O7dm/HjxwOQnp5++Hrz5s258847OemkkzjppJOYPHny\n4QFrt7hW7jvc3Cr3/dlnMHw43HefrppW6mi03HftVpvKfdd5paV2n+mUFLjxRq+jUUoFQ8t9h44m\niCq8/jqkp9vFcY0aeR2NUioYWu47dHQMohKFhfDXv0KfPna/B6WUqm+0BVGJGTNgyxZYuBCio72O\nRimlwk9bEAFkZ9tNgEaPhrPO8joapZTyhiaIAO69Fw4cgIdq3SanSikVPpogKti82XYvXXEFpNXq\nPe2UUpGisnLfxhiuv/56UlNT6d27N99++23A5y9btoy0tDRSU1O5/vrrCdXyBU0QFfz1r9CgAUyZ\n4nUkSqn6YsiQIXz88cd0rLB/8YcffsiGDRvYsGEDs2bN4ppKNqG55ppreOaZZw4fO3/+/JDEpYPU\nfv73Pzu1ddIkSE72Ohql6rgbb4RQF6nr29du5ViFSCr3/e6773L55ZcjIgwePJh9+/axY8cOkpKS\nDh+zY8cODhw4wODBgwG4/PLLmTt3LmeffXZQ710VbUE4jLGL4lq3tjvGKaXqnkgr9x1Mie/MzExS\nUlKqPKamtAXhmDcP/vtfmDkTGjf2OhqlIsBRvum7Qct9h5YmCKCoyBbkO/FEcOpiKaUiSF0t952c\nnMz27dsP387IyCC5Qv93cnIyGRkZVR5TU9rFhC3lvW6d3RAoJsbraJRSNRVp5b7HjBnDCy+8gDGG\nJUuW0KRJk3LjDwBJSUkcd9xxLFmyBGMML7zwAmPHjg3J71TvE8SBA3DXXTBsGDitUaVUHRVp5b7P\nOeccunTpQmpqKldffTUzZ848/Jy+ffsevj5z5kzGjx9PamoqXbt2DckANWi5b3bsgGuvtaW8TzrJ\nhcCUqke03HftpuW+qykpCd5+2+solFKhouW+Q6feJwilVGTRct+h4+oYhIiMFpF1IrJRRG4P8PgV\nIrJbRL5zLuP9HhsnIhucyzg341RKhU6kdFtHmpr8u7jWghCRaOBJYCSQASwVkXnGmIorQ143xkys\n8NzmwF3AQMAAy5zn7nUrXqXUsYuLiyMrK4sWLVocnhaqvGeMISsr6/DU22C52cV0MrDRGLMZQERe\nA8YCVS8dtEYBC40x2c5zFwKjgVddilUpFQK+mTi7d+/2OhRVQVxcXLkV18FwM0EkA9v9bmcAgwIc\n9ysRGQasB24yxmyv5LlHrPwQkQnABIAOHTqEKGylVE3FxMTQuXNnr8NQIeL1Ooj3gE7GmN7AQmBO\ndZ5sjJlljBlojBnYqlUrVwJUSqn6ys0EkQm097ud4tx3mDEmyxhT6NycDQwI9rlKKaXc5WaCWAp0\nE5HOItIQuASY53+AiPivGR8DrHGufwT8QkSaiUgz4BfOfUoppcLEtTEIY0yxiEzEntijgeeMMatE\nZAqQboyZB1wvImOAYiAbuMJ5braITMUmGYApvgHryixbtmyPiGw9hpBbAnuOelT4aVzVo3FVj8ZV\nPZEYV8fKHoiYUhvHSkTSK1tu7iWNq3o0ru
rRuKqnvsXl9SC1UkqpWkoThFJKqYA0QZSZ5XUAldC4\nqkfjqh6Nq3rqVVw6BqGUUiogbUEopZQKSBOEUkqpgOptghCRX4vIKhEpFZFKp4cdrWS5C3E1F5GF\nTpnzhc5CwUDHlfiVSZ8X6JgQxXO0ku2xIvK68/g3ItLJrViqEVOlZeRdjus5EdklIj9U8riIyBNO\n3N+LSFh2sgkiruEist/v85ocprjai8giEVnt/C3eEOCYsH9mQcYV9s9MROJE5H8issKJ654Ax4T2\n79EYUy8vQHfgBGAxMLCSY6KBTUAXoCGwAujhclwPAbc7128HHqzkuNwwfEZH/f2BPwH/dK5fgi3f\n7nVMVwAzPPg/NQzoD/xQyePnAB8CAgwGvqklcQ0H3vfg80oC+jvXG2MLdlb8twz7ZxZkXGH/zJzP\nING5HgN8AwyucExI/x7rbQvCGLPGGLPuKIcdLllujDkE+EqWu2ksZUUL5wDnu/x+VQnm9/eP903g\nTHF3IwAv/k2CYoz5HFsRoDJjgReMtQRoWqHcjFdxecIYs8MY861zPQdbaqdi1eawf2ZBxhV2zmeQ\n69yMcS4VZxmF9O+x3iaIIAVVdjzE2hhjdjjXdwJtKjkuTkTSRWSJiLiVRIL5/Q8fY4wpBvYDLVyK\nJ9iYwJaR/15E3hSR9gEe94IX/5+CdYrTdfGhiPQM95s7XSH9sN+K/Xn6mVURF3jwmYlItIh8B+zC\n7plT6ecVir/HiN6TWkQ+BtoGeOhvxph3wx2PT1Vx+d8wxhgRqWweckdjTKaIdAE+FZGVxphNoY61\njnoPeNUYUygif8R+ozrD45hqs2+x/59yReQcYC7QLVxvLiKJwFvAjcaYA+F636M5SlyefGbGmBKg\nr4g0Bd4RkV7GmIBjS6EQ0QnCGHPWMb6EK2XHq4pLRH4WkSRjzA6nKb2rktfIdH5uFpHF2G85oU4Q\nwfz+vmMyRKQB0ATICnEc1YrJGOP//rOx4zq1Qa0sY+9/8jPGfCAiM0WkpTHG9aJ0IhKDPQm/bIx5\nO8AhnnxmR4vLy8/Mec99IrIIu9Omf4II6d+jdjFV7agly10wDxjnXB8HHNHSEVsGPda53hIYQnBb\nuVZXML+/f7wXAZ8aZ4TMJcdSRt5r84DLnZk5g4H9ft2JnhGRtr5+ahE5GXtecDPJ+95XgGeBNcaY\naZUcFvbPLJi4vPjMRKSV03JARBoBI4G1FQ4L7d9jOEfha9MFuADbn1kI/Ax85NzfDvjA77hzsLMY\nNmG7ptyOqwXwCbAB+Bho7tw/EJjtXD8VWImdwbMSuMrFeI74/YEpwBjnehzwBrAR+B/QJQyf0dFi\nuh9Y5Xw+i4ATw/R/6lVgB1Dk/N+6Cvg/4P+cxwV40ol7JZXMnvMgrol+n9cS4NQwxXUadpD1e+A7\n53KO159ZkHGF/TMDegPLnbh+ACY797v296ilNpRSSgWkXUxKKaUC0gShlFIqIE0QSimlAtIEoZRS\nKiBNEEoppQLSBKFUACKSe/SjKn3uRKeapnHWqfjur7QyqYgkicj7frdPFpHPxVatXS4is0UkXkTO\nFZEpNf/NlAqeJgilQu9L4Cxga4X7z8aWY+gGTACe8nvsZuAZABFpg53Lfpsx5gRjTD9gPray6H+A\n80Qk3tXfQCk0QShVJedb/8Mi8oOIrBSRi537o5zyCmvF7tvxgYhcBGCMWW6M2RLg5aqqTPorbBIA\nuBaYY4z52vdEY8ybxpifjV24tBg415VfWCk/miCUqtqFQF+gD7ZV8LBzUr8Q6AT0AC4DTgnitQJW\nJhWRzsBeY0yhc38vYFkVr5MODK3G76BUjWiCUKpqp2Erw5YYY34GPgNOcu5/wxhTaozZiS3pUVNJ\nwO5qHL8LWxJGKVdpglAqfCqrTHoQW0PHZxUwoIrXiXOeo5SrNEEoVbX/Ahc7G7W0wm7f+T/sQPSv\nnLGINt
gtKI+mssqk67HdVT4zgHEiMsh3h4hc6LwPwPGUL/GslCs0QShVtXew1TNXAJ8CtzpdSm9h\nxxBWAy9hN5DZDyAi14tIBraF8L2IzHZe6wNgM7bS5jPY/YMxxuQBm0Qk1bn9M7aM+SPONNc1wCgg\nx3mdEdjZTEq5Squ5KlVDIpJo7I5iLbCtiiFO8qjJa10ADDDGTDrKcW2AV4wxZ9bkfZSqjojeUU4p\nl73vbODSEJha0+QAYIx5x0k0R9MBuKWm76NUdWgLQimlVEA6BqGUUiogTRBKKaUC0gShlFIqIE0Q\nSimlAtIEoZRSKqD/B25YBrGmuFoWAAAAAElFTkSuQmCC\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "Kqp7qoyHvUIt",
        "colab_type": "text"
      },
      "source": [
        "### 9.2 Reduce the raw data with PCA, apply MinMaxScale, then train an RBF-kernel SVM"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "HMr2YjPgvwII",
        "colab_type": "code",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        },
        "outputId": "f2ab32ad-6543-40dc-af00-0191dbb3c060"
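      },
      "source": [
        "train_val_data_pca = TrainValData(org_pca_data_path)  # load the PCA-reduced raw data and split it into training and validation sets\n",
        "search_pca = SVMSearch(train_val_data_pca)       # create the hyperparameter search object\n",
        "search_pca.search()                    # search over the hyperparameters C and gamma\n",
        "search_pca.draw()                     # plot the accuracy curves for each hyperparameter combination\n",
        "search_pca.get_best_params()                # find the best hyperparameter combination\n",
        "search_pca.save_best_model(\"./model/Otto_pca_rbf_svc.pkl\") # save the model"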
      ],
      "execution_count": 30,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "start trying: C=0.1, gamma=0.1\n",
            "accuracy=0.349\n",
            "\n",
            "start trying: C=0.1, gamma=1.0\n",
            "accuracy=0.608\n",
            "\n",
            "start trying: C=0.1, gamma=10.0\n",
            "accuracy=0.6945\n",
            "\n",
            "start trying: C=1.0, gamma=0.1\n",
            "accuracy=0.6065\n",
            "\n",
            "start trying: C=1.0, gamma=1.0\n",
            "accuracy=0.706\n",
            "\n",
            "start trying: C=1.0, gamma=10.0\n",
            "accuracy=0.756\n",
            "\n",
            "start trying: C=10.0, gamma=0.1\n",
            "accuracy=0.705\n",
            "\n",
            "start trying: C=10.0, gamma=1.0\n",
            "accuracy=0.745\n",
            "\n",
            "start trying: C=10.0, gamma=10.0\n",
            "accuracy=0.7675\n",
            "\n",
            "start trying: C=100.0, gamma=0.1\n",
            "accuracy=0.734\n",
            "\n",
            "start trying: C=100.0, gamma=1.0\n",
            "accuracy=0.7595\n",
            "\n",
            "start trying: C=100.0, gamma=10.0\n",
            "accuracy=0.768\n",
            "\n",
            "start trying: C=1000.0, gamma=0.1\n",
            "accuracy=0.7595\n",
            "\n",
            "start trying: C=1000.0, gamma=1.0\n",
            "accuracy=0.76\n",
            "\n",
            "start trying: C=1000.0, gamma=10.0\n",
            "accuracy=0.7295\n",
            "\n",
            "best_C_=100.0, best_gamma_=10.0, best_accuracy_=0.768\n",
            "start fitting with best params...\n",
            "end fitting with best params.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXhU5fXA8e/JRliSsEUIBCGQsIMo\nICpKte5WsWoXtdZdFEQEtWqr4oLVYq0roEWlgNbKT1SkCiLuqEgJq4hIAoIksgaykn3e3x/vhExC\nEiZhbu4kcz7PMw8zd+7MnLlk3nPvu4oxBqWUUqErzO0AlFJKuUsTgVJKhThNBEopFeI0ESilVIjT\nRKCUUiEuwu0A6qtjx46mR48eboehlFJNyqpVq/YZY+Jreq7JJYIePXqQmprqdhhKKdWkiMj22p7T\nqiGllApxmgiUUirEaSJQSqkQp4lAKaVCnCYCpZQKcZoIlFIqxGkiUEqpENfkxhEoFTQ8HsjNhf37\n7S07G8rLoWJqd2Oq3nRb1W0RETBkCJx6KrRvH5j/E9UgmgiUqijQs7IqC/X9+4/8eP9++1p19AYM\ngNNOq7x16+Z2RCFFE4FqPjweyMnxvyCveHzgQN0FemysPWNt3x46dIDu3as+rrgfF2fPckXsDSrv\n67aq2wCKiiA1FZYtgy++gNdfhxdftM91724TwqhR9t8+faq+VgWUNLUVyoYNG2Z0iolmrqJAr+8Z\nuj8Fum/BXb0gr+lxu3YQGdl43z2UlZfD+vU2MVTcdu+2z8XH2yqkiiuGIUNs0lV+E5FVxphhNT2n\nR1I5p7y84WfodZ2gxMVVLbiTko5csLdtqwW6i8rK8ikt3U1JSeWttHQvxpR59zCAwcQZuNDAhb3B\nJBOxbT/RK3cQnbqDlv/7lMh33gHA0zqSoiFdKBzWlcLhXSk+rhOeFuHe9/F5P2Nq2Vb5uLG2AT7x\n+LfN4zEcPGgoKICDBw0JCbczYsRF9T7+R6KJQB1ZRYHekDP0IxXovgV3UpJ/Z+h6Jug6Ywzl5bnV\nCvbqBX3lfY/noB/vKoduUlENFCFwsr2JCFH7WhC33kPsekPc+h20f247YsATAXl9w8kdHEHOceHk\nDYqgPCbc5z3xvqc04jZ8vkvd28rLhaIieyssFAoLobDQPvZ4KvfLyytjxAg/DmU96S9KHW7dOpg1\nCxYvhn37bG+Yugr0tm2rFtq9eh252qVtWy3Qg4wxhrKyA34V7CUluzGmuIZ3ESIj44mK6kRUVCdi\nY3sduh8V1YnISN/78YSFNeAq7VKf+wcOwNdfE7ZsGXHLlhH35kq6vV5s2xMGDaraAN2lS0MPTUAY\nAxkZ8P33sGmTvVXc37Wrcr/ISOjdG/r2hX79Kv/t3RvatHEmNm0jUNb+/baxbtYsWLMGoqLg3HNr\nbxiteNy2LYSHux29qoUxHkpLs/wq2EtL92BMaQ3vEk5UVHyVQryyUO9cbVtHRFz8eygshBUrKtsY\nvv4aCgrsc716VU0MycmONECXlEB6emUh71vwV4QC9oK4X7/Kwr6iwE9KcuYcqa42Ak0Eoay8HD76\nyBb+CxbYv+Djj4frr4crrrAFvQo6xpRTUrLXj4J9NyUle4Hyw95DJJLIyGNqPVv3fRwZ2QGRJjr2\ntKwM1q6t2gC9b599rlOnqolh8OB6ndTk5FQt6Cv+3bLF/rQqdOt2+Nl937724xuzI5QmAlVVejrM\nng1z5thr1fbt4aqr4LrrbG8M1eg8nlJKS/fWWqiXlOw6tK20dB+VDaCVRFr4VbBHRXUiIqJdZT18\nKDHGlta+iWG7d72W2Fg45ZTKLqvDh2OiWpCZWfPZ/c6dlW8bGQkpKYcX+H36OFedU1+aCJS9Jp0/\n3579f/EFhIXZqp/rr4eLLoIW
LdyOsNnxeIopKdnjV7VMWVlWje8RFtbKr4I9KqoT4eGxoVm4H60d\nOyj7dBl5i5YR9vUy4nZ8B0CxtCBVTuQzz2ks4zS+5hQkNrZKdU7Fv0lJwd8pTRNBqDIGli+3hf+8\neZCfb+tFr78err4aunZ1O8Imz+Mpo6BgHdnZX5CX9z+Ki38+VNCXlWXX+Jrw8Bi/CvbIyE5ERATJ\n6WQzkZMDP/xQc3VOmbcna3uy+HWHLzmv9TKGFy+j297VhHvKMGFhcNxxiG91UqdO7n6hetBEEGp2\n7oS5c+Ff/7J/9a1bw+9+ZxPAyJE6QvMoeDzF5OauJCfnC7KzvyA392vKy/MAiI7uQYsWxx7xDD48\nvKXL36J5MwZ+/rnm3jk//1y5X0RE7dU5MTE+b1hQAN98U1mVtHy5bZQG25XHNzEkJQXt70sTQSgo\nKYH33rNn/x98YFurTj3VFv6//W3wVFQ2MeXlBeTkLPcp+L851G2ydeuBxMWNom3bUcTFnUaLFu52\nTww1paX2TL6m+vu8vMr9YmNrbqzt2bOB1TmlpbB6deXUGF9+abuxgu2i6psYBg601bBBQBNBc/bt\nt7bwf+012xuiSxe45hq49lp7tqLqpbT0ADk5X3oL/mXk56/yjn4NIybmBJ+C/1QiI7VXVWPIza25\nOic9vbI6B2xNZ00FfkKCwyfpHg9s3Fi1ATojwz7Xrp29Cq9IDEOH2q7ZLtBE0NwcOAD/+Y9NAKtW\n2dOaiy+2Z/9nn60DteqhuHgXOTnLDp3xFxR8CxhEooiNHXGo4I+NPZmIiJgjvp9qGGNsjWZN1TmZ\nmZX7RUTYZq7qhX2fPvbMPygYY3siVVwxLFtmMxlAy5YwYkRlYjj55Ea7WtdE0Bx4PPDxx7bwf+cd\nKC62/Z5vuAGuvBI6dnQ7wiahqGg72dlfHCr4Cws3AxAW1pq4uFMOFfwxMScSHh7tcrTNjzG2YF+/\n3t58C/7c3Mr9YmJqPrvv1Sv4e+fUaM8eW4VUccWwZo39TYeH27E7FV1WTz3Vsd+yJoKmbOvWyj7/\nP/1kLzX/8Afb5//444O2YSoYGGM4ePCHQ4V+Ts4XFBfvACAioh1xcacRF3cabduOok2b4xs23YGq\nVUEBfPddZaFfcauoTgdbk1l9ZG3fvnZ7s/7Tzsuzjc4VieGbb+zJHdiD4NvO0L17QD5SE0FTc/Ag\nvPWWPfv/7DP7izjnHFv1M3o0ROuZak2MKSc//9sqBX9p6V4AoqI6+9Tvj6J16wFNd7RskPF4YNu2\nwwv89PTKKapat7YXsL63gQPtDCUKmwQq1mZYtgy++sr2dQU7NLkiKZx/foMTgyaCpsAYO0fKrFnw\nxhv2jKFnz8o+/7pi02E8nhLy8lb5FPxfUV5ufzzR0T2qFPwtWybrYKsAyMmx/RN8C/xvv7VDVMCe\ns/TqVVnYH3ec/bdHj6DpPNM0lJfDhg1VG6B37rQL99x8c4PeUhNBMNu1C1591SaATZugVSvb3fO6\n6+wZgP56DikvP0hu7gqfrpzL8Xhsf+5WrfpV6coZHa2J82iUl0Na2uFn+RWzMYA9m69+lj9ggPZU\ndoQxtq9su3YNngNMF6YJNqWl8P77tvBftMj+6k45BV5+2Q78itHeKQBlZTnk5HxFTs4y78jdld7Z\nMYU2bYaQkDDmUFfOqKhj3A63ydq37/Cz/A0b7EqSYNsz+/SxHVxuvrmy0E9MbOb1+MFExHaXcogm\ngsa0YYMd7fvaa7YXQefOcNddts9/375uR+e6kpK9hwr9nJwvyM9fB3gQiSQmZhiJiXd4u3KeQmSk\nVi7XV0mJ7cVY/Szfd7RtfLytzhk3rrLA79dPm6WaO00ETsvOtnX+s2bBypW2I/To0bbu/9xzQ7rP\nf1HRjioF/8GD3wMQFtaS2NiT6dFjMnFxo4iNHUF4eCuXo206jLE1jtUL/O+/txejYMc09e8PZ5
1V\ntWqnCU2dowIodEshJ3k88OmntvB/+217jT1oEDz9tO36GR/vdoSNzhhDYWG6T8PuMoqKfgQgPDyW\nuLhT6dz5GuLiRhETM5SwMHdGXzY1hYV2UGv1Qr9iyn2wVTiDB8MFF1QW+L17N9H++MoRmggCads2\n2+d/9mzbqta2rT3zv+46O7Q8hCpUjfFQUPBdla6cJSV2Pb7IyHji4kaRmDiRuLhRtGkzyN1VrZoA\nY+wwkuoF/ubN9rwD7KDVgQPtIPOKAn/QIF1fSB2ZJoKjVVhoz/pnzYJPPrGF/Vlnwd/+Br/+dchU\nrno8peTnr/Gp6llGWZkdOdSiRSJt2555qCtnq1Z9tCtnHfLzbXNS9UK/ols52EkuBw+2HcwqCv1e\nvXTVUNUwmggawhhb31/R5z8nx/4yH3nETvh27LFuR+i48vIi8vL+d+hsPyfnazweuyBry5a9iY+/\nzDtydxTR0d214K+Bx2MHjlcv8LdsqdwnJsYW8ldeWXUgVtDMq6OaBU0E9bF7t+3xM2uWrZht2RJ+\n8xtb9fOLXzTrPv9lZXnk5i4/VPDn5q7AmBJAaN16EAkJ1xEXVzEdc2e3ww06Bw7UPBDr4EH7fFiY\nnRv/hBNsJ7KKQr9795CqUVQu0URwJKWlsHixLfzff9/Oe3vSSTBzpu3zHxfndoSO8HhK2b9/MdnZ\nn5OT8wV5eWuwi6CHExMzlMTECd6CfySRke3dDjdolJXZevvqZ/k7dlTu07697aJ5002VBX7//nYs\noVJucDQRiMh5wLNAOPCyMeZv1Z5/GjjD+7AVcIwxJjg6iG/caPv8v/qqvRLo1AkmTbJn//36uR2d\no7KzP2fz5nEcPLgRkRbExp5E9+5/8XblPEmXT6zGGDsh7OOP27P8irnDIiLs8JDTTqs65YLj8+Mr\nVU+OJQKx3UCmA2cDGcBKEVlojNlYsY8xZpLP/rcBxzsVj19ycuzavrNm2Xl/IiLgwgttz5/zzmv2\n/e2Ki3exdeuf2L37NaKjezBgwFt06PArwsJ0YfvabNkCt91mLxr797f3Kwr9vn2hhR461QQ4eUVw\nIpBujNkKICJvABcDG2vZ/wrgQQfjqZnHA59/bgv/t96yvYAGDIB//AOuugqOaf5TFxhTTmbmC/z4\n4/14PIV0734/xx77Zx3EVYfiYnjiCXjsMXu+8PTTMH58SI8PVE2Yk3+2XQGfmlEygBE17Sgi3YEk\n4JNanh8DjAE4NlA9crZvt3P8/+tftv9/bKzt8XP99TBsWMhcu+fmrmDz5nHk56+mXbuzSEmZTqtW\nusRlXZYuhVtvtZOy/e538NRTdplEpZqqYDl/uRyYb4wpr+lJY8xMYCbY2Ucb/CmFhbBggT37//hj\nW7l75pnw17/CJZfYXkAhorQ0i61b/8LOnS8RFZVA//7ziI//rXbzrMPPP8Mdd9jaw+RkWLLELhOh\nVFPnZCLIBHznAk70bqvJ5cCtDsYCr7wCd95p2wG6d4cHH7RXAD16OPqxwcYYD7t2zWbLlrspK8sm\nMXESPXo8pOvx1qGsDKZNg8mT7cRtDz8Md98dMmMFVQhwMhGsBFJEJAmbAC4Hrqy+k4j0BdoByx2M\nxS7sUtHwe/rpzbrPf23y89exefM4cnO/JjZ2JL17v0CbNoPcDiuoLV8OY8fCunW2v8C0aXYEr1LN\niWOJwBhTJiLjgSXY7qOzjDHficgjQKoxZqF318uBN4zTK+Scc07IXseXleWybduDZGQ8T2RkO/r0\n+RedO1+tSzXWISsL7r3XLhHRtSvMnw+XXhoyTUcqxDjaRmCMWQQsqrZtcrXHDzkZQygzxrBnzzy2\nbLmDkpJddOlyM0lJf9UBYHXweOycgXffbWcQv/NOW4uoawWp5ixYGotVgBUUbCIt7Vaysz+hTZuh\nDBz4LrGxw90OK6itX28XZPnqKxg5El54wc7eqVRzp4mgmS
kvP8j27Y+yY8eThIW1IiVlBl26jNFp\nnuuQlwcPPQTPPmtnDp81y/YjCMFmJBWiNBE0I/v2LSQtbQLFxdvp1OlqevV6gqgoXXKqNsbYMYQT\nJ0JmJowZYweI6fz9KtRoImgGCgt/JD19AllZ79Gq1QCGDPmctm1HuR1WUEtPtyOBlyyBIUNsY/BJ\nJ7kdlVLu0ETQhHk8xfz009/56ae/AuH06vUkXbtOICysec+JdDSKimDqVDtBXFSUrQ4aN06nhlCh\nTf/8m6j9+5eSlnYrhYVpxMf/hl69niY6OtHtsILakiV2aogtW+Dyy+10Ul26uB2VUu7TRNDEFBdn\nkp5+B3v3/h8tWyYzePAHtG9/rtthBbXMTNsOMH++Xfzlww/h7LPdjkqp4KGJoInweErJzHyebdse\nxJgyevR4hG7d/kR4uM5zUJuyMnjuOTsOoKwMpkyBP/1Jp4ZWqjpNBE1AdvaXpKWNo6DgW9q3v4CU\nlOdp2bKn22EFta++snX/69fDBRfA889DTz1kStVIe0oHsZKSPWzadB1r155GWVkOAwa8w6BB72kS\nqMO+fXDDDXDqqbB/P7z9Nrz3niYBpeqiVwRByJhyfv75JX788c+Ulxdw7LH30r37/YSHt3Y7tKDl\n8diBYPfcA7m5tgpo8mRoo6tqKnVEmgiCTG5uKmlpY8nLS6Vt2zNISZlO69bNe43ko7VunZ0hdPly\nuz7wjBkwcKDbUSnVdGjVUJAoLT3A5s3jWL36RIqLM+jX798cd9zHmgTqkJsLkybBCSfY1cJmz7ar\njmoSUKp+9IrAZcYYdu9+lS1b7qK0NIuuXSeQlPQwERFxbocWtIyBN9+0SWDnTrj5ZrvIXHudVFWp\nBtFE4KL8/A2kpY0jJ2cZsbEnMXjwh8TEDHE7rKC2ebOdGmLpUjj+eNsYPKLGlbCVUv7SROCCsrI8\ntm17mIyMZ4iIaEufPi/TufN1ulBMHQoL7bQQU6faJSKff962C4TrpKpKHTVNBI3IGMPevfNJT59E\nSUkmCQk30bPn40RG6nSXdVm82F4FbN0KV14JTz4JCQluR6VU86GJoJEcPJhGWtp4Dhz4kDZthjBg\nwHzi4nS6y7rs2GHbAd56C/r0gY8+gjPPdDsqpZofTQQOKy8v5KefHuenn6YSFhZNcvJzdOkylrAw\nPfS1KS21s4I+9BCUl8Ojj8Jdd+nUEEo5RUsjB2VlvU9a2m0UFf3IMcf8gV69/k6LFlqnUZcvv7R1\n/xs2wK9+ZdsCkpLcjkqp5k0TgQOKiraTlnY7WVnv0qpVP4477hPatTvD7bCC2t69dsH42bPh2GNh\nwQIYPRpE3I5MqeZPE0EAeTwl7NjxFNu3PwIIPXv+jcTESYSFRbkdWtDyeODll+Hee+3awffcAw88\nAK11Ng2lGo0mggA5cOAT0tJu5eDBTXTseAnJyc8QHX2s22EFtTVrbDXQihXwi1/YqSH693c7KqVC\nj3ZcP0rFxTvZuPFK1q07E4+nhEGD3mfgwLc1CdQhNxduvx2GDbNdQufOhU8/1SSglFv0iqCBPJ4y\nfv55Oj/++AAeTzHdu0/m2GPvJTy8pduhBS1jYN48uOMO2LXLXg08+ii0a+d2ZEqFNk0EDZCTs5zN\nm8dSULCOdu3OJSXleVq1SnE7rKD2ww92veCPP4ahQ+Hdd2H4cLejUkqBJoJ6KSnZx9at97Jr1ytE\nRXVlwID5dOx4KaJdW2pVWAiPPQZPPAEtW8K0aXDLLTo1hFLBRBOBH4zxsHPnK2zdei/l5bl06/Yn\nunefTESErnpSl/ffh9tugx9/hKuugr//HTp3djsqpVR1mgiOIC9vDZs3jyUvbwVxcaeRkjKDNm10\nwvu6/PQTTJwI77wDffvCJ5/AGTqMQqmgpYmgFmVlOfz44wNkZk4nMrIjffvOpVOnq7QaqA6lpfD0\n0/Dww7Zh+LHH4M47IU
qHUSgV1DQRVGOMYc+e10lPv5PS0j106TKOpKRHiYxs63ZoQe2LL2wvoI0b\n7YjgZ5+FHj3cjkop5Q9NBD4KCjaSlnYr2dmfERMznMGD3ycmZqjbYQW1PXvsQvFz50L37rY30OjR\nbkellKoPTQRAeXkB27ZNISPjH4SHx9C794skJNyIiHZtqU15Obz0Evz5z1BQYP+97z6dGkKppiik\nE4Exhn373iE9fSLFxTvo3Pk6evacSlRUvNuhBbXVq2010P/+B6efbqeG6NfP7aiUUg0VsomgsHAL\naWm3sX//Ylq3HkT//v8hLm6k22EFtexsOyHcjBnQsSO89ppdMUzbz5Vq2kIuEZSXF7Fjx1S2b3+c\nsLBIevV6iq5db9OFYupgDLz+uu0BtGcPjBtnp4Zoq+3nSjULIVX6ZWV9QFraeIqKtnDMMZfTq9c/\naNGii9thBbVNm2zB/+mndpK499+3U0QopZqPkJl9dPv2x/n22/MRCWfw4KX07/8fTQJ1KCmxjb+D\nB9s2gRkz4JtvNAko1Rw5ekUgIucBzwLhwMvGmL/VsM/vgIcAA6wzxlzpRCzx8ZcBHrp1u4uwMF38\n9kjuuw+efBL++Ec7NUSnTm5HpJRyihhjnHlj2/dyM3A2kAGsBK4wxmz02ScF+D/gl8aYAyJyjDFm\nT13vO2zYMJOamupIzMpau9ZWA11/Pcyc6XY0SqlAEJFVxphhNT3nZNXQiUC6MWarMaYEeAO4uNo+\nNwHTjTEHAI6UBJTzysthzBjo0AGmTnU7GqVUY/ArEYjI2yLyKxGpT+LoCuzweZzh3earN9BbRL4S\nkW+8VUnKRS+8ACtXwjPP6IIxSoUKfwv2GcCVQJqI/E1E+gTo8yOAFOB04ArgJRE5rFOiiIwRkVQR\nSd27d2+APlpVl5kJf/kLnHMOXH6529EopRqLX4nAGPORMeYPwAnANuAjEflaRK4TkchaXpYJdPN5\nnOjd5isDWGiMKTXG/IhtUzhsqS9jzExjzDBjzLD4eB3165QJE+wMojNm6CAxpUKJ31U9ItIBuBa4\nEViD7Q10ArC0lpesBFJEJElEooDLgYXV9lmAvRpARDpiq4q2+h++CpSFC+Htt+HBB6FXL7ejUUo1\nJr+6j4rIO0Af4FXgImPMTu9T80Skxi48xpgyERkPLMF2H51ljPlORB4BUo0xC73PnSMiG4Fy4E/G\nmKyj+0qqvvLzYfx4GDjQjh5WSoUWf8cRPGeM+bSmJ2rrjuR9bhGwqNq2yT73DXCH96ZcMnky7NgB\nb7wBkbVV9Cmlmi1/q4b6+zbiikg7ERnnUEyqEa1ebReRueUWOOUUt6NRSrnB30RwkzEmu+KBt9//\nTc6EpBpLWZkdM3DMMfD4425Ho5Ryi79VQ+EiIt6qnIpRw7oSbRM3fTqsWgXz5ulMokqFMn8TwQfY\nhuF/eh/f7N2mmqgdO+D+++H88+G3v3U7GqWUm/xNBPdgC/+x3sdLgZcdiUg1ittus9NJTJ+uYwaU\nCnV+JQJjjAd4wXtTTdyCBXaR+SeegKQkt6NRSrnN33EEKcDjQH8gumK7MaanQ3Eph+Tm2jEDgwfD\nxIluR6OUCgb+9hr6F/ZqoAw4A5gLvOZUUMo5DzwAP/9sp5fWMQNKKfA/EbQ0xnyMXb9guzHmIeBX\nzoWlnLByJTz/vF16csQIt6NRSgULfxuLi71TUKd5p43IBNo4F5YKtIoxA507w1//6nY0Sqlg4m8i\nuB1oBUwApmCrh65xKigVeM89Z1cee/NNiItzOxqlVDA5YiLwDh77vTHmLiAfuM7xqFRAbd9u2wYu\nvBAuu8ztaJRSweaIbQTGmHLg1EaIRTnAGNtLCGDaNB0zoJQ6nL9VQ2tEZCHwJlBQsdEY87YjUamA\neftteO89ePJJ6N7d7WiUUsHI30QQDWQBv/TZZgBNBEEsJ8eOIB4yBG6/3e1olFLByt+R
xdou0ATd\ndx/s2mVHEUf4m/KVUiHH35HF/8JeAVRhjLk+4BGpgFixwq49PH48DB/udjRKqWDm73niez73o4FL\ngJ8DH44KhNJSO2agSxd49FG3o1FKBTt/q4be8n0sIv8BvnQkInXUnnkG1q+3DcWxsW5Ho5QKdv5O\nMVFdCnBMIANRgbFtGzz4IIweDb/+tdvRKKWaAn/bCPKo2kawC7tGgQoixsCtt0JYmI4ZUEr5z9+q\noRinA1FH7803YdEiePpp6NbN7WiUUk2FX1VDInKJiMT5PG4rIlrxEESys+1YgRNOqBxJrJRS/vC3\njeBBY0xOxQNjTDbwoDMhqYb4859hzx67zoCOGVBK1Ye/iaCm/bS4CRLLl8OLL8KECTB0qNvRKKWa\nGn8TQaqIPCUivby3p4BVTgam/FMxZqBbN5gyxe1olFJNkb+J4DagBJgHvAEUAbc6FZTy3z/+ARs2\n2F5CbXSpIKVUA/jba6gAuNfhWFQ9bd0KDz8Ml1xixw0opVRD+NtraKmItPV53E5EljgXljoSY+za\nw5GRdh1ipZRqKH8bfDt6ewoBYIw5ICI6sthFb7wBS5bYJSi7dnU7GqVUU+ZvG4FHRI6teCAiPahh\nNlLVOA4cgIkT7ayi48a5HY1Sqqnz94rgPuBLEfkcEOA0YIxjUak63XsvZGXZK4LwcLejUUo1df42\nFn8gIsOwhf8aYAFQ6GRgqmZffmkHjd15p115TCmljpa/k87dCNwOJAJrgZOA5VRdulI5rKQEbr4Z\njj0WHnrI7WiUUs2Fv20EtwPDge3GmDOA44Hsul+iAu3JJ2HjRpg+XccMKKUCx99EUGSMKQIQkRbG\nmE1AH+fCUtWlp8Mjj8BvfgMXXuh2NEqp5sTfxuIM7ziCBcBSETkAbHcuLOXLGBg7Flq0gGefdTsa\npVRz429j8SXeuw+JyKdAHPCBY1GpKv79b/joI1sl1KWL29EopZqbes8gaoz53IlAVM2ysmDSJBgx\nwjYUK6VUoDV0zWK/iMh5IvKDiKSLyGFzFYnItSKyV0TWem83OhlPU3TPPXYA2cyZOmZAKeUMx9YU\nEJFwYDpwNpABrBSRhcaYjdV2nWeM0TW1avDFF/DKK3D33TB4sNvRKKWaKyevCE4E0o0xW40xJdjp\nqy928POaleJiWxXUowdMnux2NEqp5szJRNAV2OHzOMO7rbrLRGS9iMwXkRqXXBeRMSKSKiKpe/fu\ndSLWoPPEE7BpE8yYAa1bux2NUqo5c7SNwA//BXoYYwYDS4E5Ne1kjJlpjBlmjBkWHx/fqAG6YfNm\n+Otf4fe/h/PPdzsapVRz52QiyAR8z/ATvdsOMcZkGWOKvQ9fBkJ+xV1j4JZbIDoannnG7WiUUqHA\nyUSwEkgRkSQRiQIuBxb67mx2CBQAABQGSURBVCAiCT4PRwPfOxhPk/Dqq/DppzB1KnTu7HY0SqlQ\n4FivIWNMmYiMB5YA4cAsY8x3IvIIkGqMWQhMEJHRQBmwH7jWqXiagn374I474OST4aab3I5GKRUq\nxJimtb7MsGHDTGpqqtthOOK66+C112DNGhg40O1olFLNiYisMsYMq+k5txuLlddnn8Hs2fCnP2kS\nUEo1Lk0EQaCoyI4Z6NkT7r/f7WiUUqHGsTYC5b+//c12GV2yBFq1cjsapVSo0SsCl23aBI8/Dlde\nCeec43Y0SqlQpInARRVjBlq1gqeecjsapVSo0qohF82eDZ9/bmcW7dTJ7WiUUqFKrwhcsncv3HUX\nnHoq3HCD29EopUKZJgKX3Hkn5OXBP/8JYfq/oJRykRZBLvj4YzuVxN13Q//+bkejlAp1mggaWWGh\nbSBOTob77nM7GqWU0sbiRvfYY5CeDkuXQsuWbkejlFJ6RdCoNm60s4pedRWcdZbb0SillKWJoJF4\nPHYaiZgYHTOglAouWjXUSGbNgi+/tIvRh8Aia0qp
JkSvCBrB7t12VtFRo+xU00opFUw0ETSCO+6A\nggI7ZkDE7WiUUqoqTQQO+/BDeP11+POfoW9ft6NRSqnDaSJwUGEhjB0LvXvbRKCUUsFIG4sdNGUK\nbN0Kn3wC0dFuR6OUUjXTKwKHbNgAf/87XHMNnHGG29EopVTtNBE4oGLMQFwcPPmk29EopVTdtGrI\nAS+9BF9/bdcb6NjR7WiUUqpumggCbNcuuOceWx109dVuR6OUM0pLS8nIyKCoqMjtUFQ10dHRJCYm\nEhkZ6fdrNBEE2MSJtrfQiy/qmAHVfGVkZBATE0OPHj0Q/UMPGsYYsrKyyMjIICkpye/XaRtBAC1e\nDPPm2emle/d2OxqlnFNUVESHDh00CQQZEaFDhw71vlLTRBAgBw/CuHF20Ng997gdjVLO0yQQnBry\n/6JVQwHy8MOwbZtdjL5FC7ejUUop/+kVQQCsXw//+Adcf72dWE4ppWry448/MmLECJKTk/n9739P\nSUnJYftkZWVxxhln0KZNG8aPH98ocWkiOEoeD4wZA+3awRNPuB2NUiqY3XPPPUyaNIn09HTatWvH\nK6+8ctg+0dHRTJkyhScbcRCSVg0dpRdfhBUr7GL0HTq4HY1SjW/iRFi7NrDvOWQIPPNM3ftMmTKF\n1157jfj4eLp168bQoUOJi4tj5syZlJSUkJyczKuvvkqrVq249tpradmyJWvWrGHPnj3MmjWLuXPn\nsnz5ckaMGMHs2bMBaNOmDWPHjmXRokUkJCTw2GOPcffdd/PTTz/xzDPPMHr0aLZt28Yf//hHCgoK\nAJg2bRqnnHLKEb+TMYZPPvmE119/HYBrrrmGhx56iLFjx1bZr3Xr1px66qmkp6fX/8A1kF4RHIWf\nf7aTyZ15JvzhD25Ho1ToWLlyJW+99Rbr1q1j8eLFpKamAnDppZeycuVK1q1bR79+/aqccR84cIDl\ny5fz9NNPM3r0aCZNmsR3333Ht99+y1pvJisoKOCXv/wl3333HTExMdx///0sXbqUd955h8mTJwNw\nzDHHsHTpUlavXs28efOYMGECAHl5eQwZMqTG28aNG8nKyqJt27ZERNjz78TERDIzMxvzsNVKrwiO\nwsSJUFwML7ygYwZU6DrSmbsTvvrqKy6++GKio6OJjo7moosuAmDDhg3cf//9ZGdnk5+fz7nnnnvo\nNRdddBEiwqBBg+jUqRODBg0CYMCAAWzbto0hQ4YQFRXFeeedB8CgQYNo0aIFkZGRDBo0iG3btgF2\nMN348eNZu3Yt4eHhbN68GYCYmJhDCaUm+/btc+JQBIQmggZ6/31480149FFISXE7GqUUwLXXXsuC\nBQs47rjjmD17Np999tmh51p4u/OFhYUdul/xuKysDIDIyMhD3S999/Pd5+mnn6ZTp06sW7cOj8dD\ntHdq4by8PE477bQa43r99dfp168f2dnZlJWVERERQUZGBl27dg3sAWggrRpqgIICO2agXz+7BKVS\nqnGNHDmS//73vxQVFZGfn897770H2MI4ISGB0tJS/v3vfzvy2Tk5OSQkJBAWFsarr75KeXk5UHlF\nUNOtf//+iAhnnHEG8+fPB2DOnDlcfPHFjsRYX5oIGuChh+Cnn2DmTIiKcjsapULP8OHDGT16NIMH\nD+b8889n0KBBxMXFMWXKFEaMGMHIkSPp69CSgOPGjWPOnDkcd9xxbNq0idatW/v92qlTp/LUU0+R\nnJxMVlYWN9xwAwALFy481AYB0KNHD+644w5mz55NYmIiGzduDPj38CXGGEc/INCGDRtmKhqG3LB2\nLQwbZscMzJzpWhhKuer777+nX79+rsaQn59PmzZtOHjwIKNGjWLmzJmccMIJrsYULGr6/xGRVcaY\nYTXtr20E9VBebscMdOgAU6e6HY1SoW3MmDFs3LiRoqIirrnmGk0CR0ETQT3MmAErV9rF6Nu1czsa\npUJbRX98dfQc
bSMQkfNE5AcRSReRe+vY7zIRMSJS42VLMMjIsLOKnnMOXH6529EopVTgOJYIRCQc\nmA6cD/QHrhCR/jXsFwPcDqxwKpZAuP12KC21VwU6ZkAp1Zw4eUVwIpBujNlqjCkB3gBq6is1BZgK\nBO1SRwsXwttvw4MPQq9ebkejlFKB5WQi6Ars8Hmc4d12iIicAHQzxrxf1xuJyBgRSRWR1L179wY+\n0jrk5cGtt8LAgXDnnY360Uop1ShcG0cgImHAU8ARi1djzExjzDBjzLD4+Hjng/Px4IO2feCf/4R6\nLAGqlFKHmTZtGsnJyYhInVNOzJkzh5SUFFJSUpgzZ47jcTnZaygT6ObzONG7rUIMMBD4zDukuzOw\nUERGG2PcGyjgY9UqePZZuOUW8GNyQaWUqtPIkSO58MILOf3002vdZ//+/Tz88MOkpqYiIgwdOpTR\no0fTzsGuik4mgpVAiogkYRPA5cCVFU8aY3KAjhWPReQz4K5gSQJlZXbMQHw8PP6429EoFbzS0iaS\nnx/YeajbtBlCSkrds9k1tWmoAY4//vgj7rNkyRLOPvts2rdvD8DZZ5/NBx98wBVXXOHXZzSEY1VD\nxpgyYDywBPge+D9jzHci8oiIjHbqcwNl+nRYvdpeEbRt63Y0SilfTXEaan9lZmbSrVtlZUpjTFft\n6IAyY8wiYFG1bZNr2fd0J2Opjx074P774fzz4Xe/czsapYLbkc7cndAUp6EOZjqyuAa33Wank5g+\nXccMKNWUBPM01P37HzaMqkZdu3atEndGRkadbQqBoLOPVrNgAbz7rp1hNCnJ7WiUUjVpitNQ++vc\nc8/lww8/5MCBAxw4cIAPP/ywypWNEzQR+MjNhfHjYfBgmDTJ7WiUUrVpqtNQP/fccyQmJpKRkcHg\nwYO58cYbAUhNTT10v3379jzwwAMMHz6c4cOHM3ny5EMNx07Raah93H47PP88fP01nHSSIx+hVLOg\n01AHN52GuoFWrrRJYOxYTQJKNQU6DXXgaCKgcsxA587w2GNuR6OU8odOQx04mgiA556zK4+9+SbE\nxbkdjVJKNa6Qbyzevh0eeAB+9Su47DK3o1FKqcYX0onAGDuzKOiYAaVU6ArpqqG334b334cnn4Tu\n3d2ORiml3BGyVwQ5OXYE8ZAhttuoUko5rbZpqI0xTJgwgeTkZAYPHszq1atrfP2qVasYNGgQycnJ\nTJgwgUB1/w/ZRHDffbBrF8ycCREhfV2klGosI0eO5KOPPqJ7tSqIxYsXk5aWRlpaGjNnzmTs2LE1\nvn7s2LG89NJLh/b94IMPAhJXSBaBK1bYtYfHj4fhw92ORqkmbuJE2+0ukIYMgWdCZxrqd999l6uv\nvhoR4aSTTiI7O5udO3eSkJBwaJ+dO3eSm5vLSd6BTldffTULFizg/PPP9+uz6xJyVwSlpXbMQJcu\n8OijbkejlGqI5jYNtT9TT2dmZpKYmFjnPg0VclcEzzwD69fbhuLYWLejUaoZOMKZuxN0GurACqlE\nsG2bXYN49Gj49a/djkYpFWhNdRrqrl27smPHjkOPMzIy6Nq162H7ZGRk1LlPQ4VM1ZAxMG4chIXB\ntGk6ZkCppqy5TUM9evRo5s6dizGGb775hri4uCrtAwAJCQnExsbyzTffYIxh7ty5XHzxxQH5TiGT\nCN58ExYvtu0CPlVxSqkmqLlNQ33BBRfQs2dPkpOTuemmm5gxY8ah1wwZMuTQ/RkzZnDjjTeSnJxM\nr169AtJQDCE0DfWSJfDCCzB/vnYXVepo6TTUwU2noa7Fuefam1KqedBpqAMnZBKBUqp50WmoAydk\n2giUUoHV1KqVQ0VD/l80ESil6i06OpqsrCxNBkHGGENWVtahLq3+0qohpVS9VfR82bt3r9uhqGqi\no6OrjED2hyYCpVS9RUZGkpSU5HYYKkC0akgppUKcJgKllApxmgiUUirENbmRxS
KyF9jewJd3BPYd\nca/Gp3HVj8ZVf8Eam8ZVP0cTV3djTHxNTzS5RHA0RCS1tiHWbtK46kfjqr9gjU3jqh+n4tKqIaWU\nCnGaCJRSKsSFWiKY6XYAtdC46kfjqr9gjU3jqh9H4gqpNgKllFKHC7UrAqWUUtVoIlBKqRDXrBOB\niPxWRL4TEY+I1NrlSkTOE5EfRCRdRO5thLjai8hSEUnz/tuulv3KRWSt97bQwXjq/P4i0kJE5nmf\nXyEiPZyKpZ5xXSsie32O0Y2NFNcsEdkjIhtqeV5E5Dlv3OtFpFFWTPEjrtNFJMfneE1uhJi6icin\nIrLR+1u8vYZ9Gv14+RlXox8v7+dGi8j/RGSdN7aHa9gnsL9JY0yzvQH9gD7AZ8CwWvYJB7YAPYEo\nYB3Q3+G4ngDu9d6/F5hay375jXCMjvj9gXHAi977lwPzgiSua4FpLvxdjQJOADbU8vwFwGJAgJOA\nFUES1+nAe418rBKAE7z3Y4DNNfw/Nvrx8jOuRj9e3s8VoI33fiSwAjip2j4B/U026ysCY8z3xpgf\njrDbiUC6MWarMaYEeAO42OHQLgbmeO/PAX7t8OfVxZ/v7xvvfOBMEZEgiMsVxpgvgP117HIxMNdY\n3wBtRSQhCOJqdMaYncaY1d77ecD3QNdquzX68fIzLld4j0O+92Gk91a9V09Af5PNOhH4qSuww+dx\nBs7/QXQyxuz03t8FdKplv2gRSRWRb0TEqWThz/c/tI8xpgzIATo4FE994gK4zFudMF9Eujkck7/c\n+Jvy18neKofFIjKgMT/YW31xPPYM15erx6uOuMCl4yUi4SKyFtgDLDXG1HrMAvGbbPLrEYjIR0Dn\nGp66zxjzbmPHU6GuuHwfGGOMiNTWh7e7MSZTRHoCn4jIt8aYLYGOtQn7L/AfY0yxiNyMPUP6pcsx\nBbPV2L+pfBG5AFgApDTGB4tIG+AtYKIxJrcxPtMfR4jLteNljCkHhohIW+AdERlojKmx7ScQmnwi\nMMacdZRvkQn4nkkmercdlbriEpHdIpJgjNnpvQTeU8t7ZHr/3Soin2HPWgKdCPz5/hX7ZIhIBBAH\nZAU4jnrHZYzxjeFlbNtLMHDkb+po+RZ0xphFIjJDRDoaYxydXE1EIrGF7b+NMW/XsIsrx+tIcbl1\nvKrFkC0inwLnAb6JIKC/Sa0agpVAiogkiUgUtuHFsR46XguBa7z3rwEOu3IRkXYi0sJ7vyMwEtjo\nQCz+fH/feH8DfGK8rVQOOmJc1eqRR2PreYPBQuBqb2+Yk4Acn6pA14hI54p6ZBE5Efv7dzShez/v\nFeB7Y8xTtezW6MfLn7jcOF7ez4r3XgkgIi2Bs4FN1XYL7G+ysVvEG/MGXIKtbywGdgNLvNu7AIt8\n9rsA22tgC7ZKyem4OgAfA2nAR0B77/ZhwMve+6cA32J7y3wL3OBgPId9f+ARYLT3fjTwJpAO/A/o\n2Uj/f0eK63HgO+8x+hTo20hx/QfYCZR6/75uAG4BbvE+L8B0b9zfUkuPNRfiGu9zvL4BTmmEmE7F\nNnSuB9Z6bxe4fbz8jKvRj5f3cwcDa7yxbQAme7c79pvUKSaUUirEadWQUkqFOE0ESikV4jQRKKVU\niNNEoJRSIU4TgVJKhThNBCpkiUj+kfeq9bXjvTM/Gu84j4rttc6kKSIJIvKez+MTReQLsTOsrhGR\nl0WklYhcKCKPNPybKVU/mgiUapivgLOA7dW2n4+dhiAFGAO84PPcHcBLACLSCdsP/B5jTB9jzPHA\nB9iZMN8HLhKRVo5+A6W8NBGokOc9i/+7iGwQkW9F5Pfe7WHeaQU2iV03YpGI/AbAGLPGGLOthrer\naybNy7CFPcCtwBxjzPKKFxpj5htjdhs7uOcz4EJHvrBS1WgiUAouBYYAx2HP8v/uLbwvBXoA/YE/\nAif78V41zqQpIknAAWNMsXf7QGBVHe+TCp
xWj++gVINpIlDKTjfwH2NMuTFmN/A5MNy7/U1jjMcY\nsws7jUVDJQB767H/HuxUKEo5ThOBUoFV20yahdj5YSp8Bwyt432iva9RynGaCJSCZcDvvYuBxGOX\nfPwftkH4Mm9bQSfs0oVHUttMmpux1UwVpgHXiMiIig0icqn3cwB6U3XaYaUco4lAKXgHO9PjOuAT\n4G5vVdBb2Dr+jcBr2IVKcgBEZIKIZGDP+NeLyMve91oEbMXOCvkSdm1ZjDEFwBYRSfY+3o2dWvtJ\nb/fR74FzgTzv+5yB7T2klON09lGl6iAibYxdoaoD9iphpDdJNOS9LgGGGmPuP8J+nYDXjTFnNuRz\nlKqvJr9CmVIOe8+7SEgUMKWhSQDAGPOON6EcybHAnQ39HKXqS68IlFIqxGkbgVJKhThNBEopFeI0\nESilVIjTRKCUUiFOE4FSSoW4/wf9qGK3bmyvzAAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "G88a46NrYG_8",
        "colab_type": "text"
      },
      "source": [
        "### 9.3 Apply TF-IDF to the raw data, then MinMaxScale, then train an RBF-kernel SVM"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "9phPqeQ0YRqn",
        "colab_type": "code",
        "outputId": "91142d04-3c86-4e6b-aa35-3fbc3721d6b1",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        }
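      },
      "source": [
        "train_val_data_tfidf = TrainValData(tfidf_data_path)    # load the TF-IDF data and split it into training and validation sets\n",
        "search_tfidf = SVMSearch(train_val_data_tfidf)       # create the hyperparameter search object\n",
        "search_tfidf.search()                     # search over the hyperparameters C and gamma\n",
        "search_tfidf.draw()                      # plot the accuracy curves for each hyperparameter combination\n",
        "search_tfidf.get_best_params()                 # find the best hyperparameter combination\n",
        "search_tfidf.save_best_model(\"./model/Otto_tfidf_rbf_svc.pkl\") # save the model"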
      ],
      "execution_count": 31,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "start trying: C=0.1, gamma=0.1\n",
            "accuracy=0.7095\n",
            "\n",
            "start trying: C=0.1, gamma=1.0\n",
            "accuracy=0.737\n",
            "\n",
            "start trying: C=0.1, gamma=10.0\n",
            "accuracy=0.367\n",
            "\n",
            "start trying: C=1.0, gamma=0.1\n",
            "accuracy=0.749\n",
            "\n",
            "start trying: C=1.0, gamma=1.0\n",
            "accuracy=0.7885\n",
            "\n",
            "start trying: C=1.0, gamma=10.0\n",
            "accuracy=0.6125\n",
            "\n",
            "start trying: C=10.0, gamma=0.1\n",
            "accuracy=0.779\n",
            "\n",
            "start trying: C=10.0, gamma=1.0\n",
            "accuracy=0.7865\n",
            "\n",
            "start trying: C=10.0, gamma=10.0\n",
            "accuracy=0.634\n",
            "\n",
            "start trying: C=100.0, gamma=0.1\n",
            "accuracy=0.794\n",
            "\n",
            "start trying: C=100.0, gamma=1.0\n",
            "accuracy=0.7645\n",
            "\n",
            "start trying: C=100.0, gamma=10.0\n",
            "accuracy=0.6325\n",
            "\n",
            "start trying: C=1000.0, gamma=0.1\n",
            "accuracy=0.7765\n",
            "\n",
            "start trying: C=1000.0, gamma=1.0\n",
            "accuracy=0.7685\n",
            "\n",
            "start trying: C=1000.0, gamma=10.0\n",
            "accuracy=0.6325\n",
            "\n",
            "best_C_=100.0, best_gamma_=0.1, best_accuracy_=0.794\n",
            "start fitting with best params...\n",
            "end fitting with best params.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYIAAAEGCAYAAABo25JHAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXzU1b3/8dcne9hJgmxhB3cWJYoW\ntWq1ilXQ2kX7qEur0mLRqr1V2qpVubf9dbnWa9Xe4gZqrbZ6QVCUYqvVWheiiAqoIIIEUUjYAyEk\nOb8/zoRMkkkyCfPNd5J5Px+PeeS7nJn5zCQ5n+/3nPM9X3POISIiqSst7ABERCRcSgQiIilOiUBE\nJMUpEYiIpDglAhGRFJcRdgCtVVBQ4IYOHRp2GCIiHcqbb75Z6pzrE2tfh0sEQ4cOpbi4OOwwREQ6\nFDNb19Q+NQ2JiKQ4JQIRkRQXaCIwszPN7AMzW21mM2LsH2xmL5jZUjN7x8zOCjIeERFpLLBEYGbp\nwN3AJOBw4EIzO7xBsRuBvzjnjgIuAO4JKh4REYktyDOCY4HVzrk1zrlK4DFgSoMyDugRWe4JfBpg\nPCIiEkOQiWAgsD5qvSSyLdotwLfNrARYCFwV64XMbKqZFZtZ8ebNm4OIVUQkZYXdWXwhMNs5Vwic\nBTxsZo1ics7Ncs4VOeeK+vSJOQxWRETaKMjrCDYAg6LWCyPbol0GnAngnHvVzHKAAmBTgHGJSDty\nDnbtgs8/h88+q/u5ZQvk5UFhIQwc6B8HHQTp6WFHnHqCTARLgFFmNgyfAC4AvtWgzCfAl4DZZnYY\nkAOo7UekA4hVuX/+eezlPXvie82MDOjfvy4xRCeJ6PWcnGA/W6oJLBE456rMbDqwCEgHHnDOLTez\n24Bi59x84EfAvWZ2Lb7j+FKnO+WIhKa8vOlKveG23bsbP98MCgqgXz/o2xdGjKhb7tu3/nJeHpSV\nwYYNdY+Skrrl5cth0SKfcBpqeCYRK2nk5fl4pGXW0erdoqIipykmgrN376eUlj5Faek8duz4d2Rr\nOmZ1j4brZhntVCYj5vPCKZPRaFtaWhZpadnt+vuKx+7d8Vfu5eWNn28G+flNV+i1y/36+SSQkeDD\nyx07YieK6OVNm3wTVLScnKYTRe1yv36QmZnYeJOVmb3pnCuKta/DzTUkiVde/j6lpfMoLZ3Hzp2v\nA5CbO5K+fS8mLS0b56pxrhqojixXxdjWdBnnqqip2XvAr1O7zZ88JqfMzL7k5g4nN3cEOTnDyc0d\nHvk5gqysfsQYC9Eme/bEX7nHOqKG+pX7scc2Xbn36ZP4yr01evTwj8MOa7pMZSVs3Nj02cXrr8OT\nT/py0cz852zp7KJ792A/Y9h0RpCCnKth584llJbOY/PmuezZ8wEA3bsXUVBwLgUF59Kly+FYkp5X\nO+fakFDqygVVpqZmDxUV66ioWMOePWvYu3c9ULM/7rS0HHJyhu1PDNFJIidnGPv25bbY1l77c+fO\n2N9NXl58R+59+qTOkXAt5+qaoqKTRMP1rVsbP7dHj5bPLvr0gbSwx2E2Q2cEQk1NJdu2vUhp6VxK\nS5+isnIjkE6vXidTWHgV+fmTyckZ1OLrJAMzizQjJfefb01NJTt2rOOzz9ZQVraG8vKP2LFjDbCG\n7Ox/kplZ/1C9tLQ/GzcO59NPR/Dpp8PZuNE/ystHkJPTl759jaOPrl+hR1fyffpAVlY4n7UjqO2/\nKCiAsWObLrd7d/0k0TBRPP+8T8bV1fWfl5npO7qbO7sYMCA5O7p1RtCJVVXtZMuWZyktnUdZ2TNU\nV+8gLa0LeXmTKCg4l/z8r5CZ2TvsMDu08nJ4911YtgxWrvTNE9FH7tu3x35er16OkSNLOfjgNQwd\nuoYBA9ZQUPARPXqsITt7DWlp
JUQ3gaWl5e5vamrY7JSTM4z09CSsXTqx6mr/O27p7CJWn0tBQcuj\nonr1SnxHd3NnBEoEnUxl5eeUls6ntHQeW7c+j3OVZGYWkJ8/mYKCc+nd+zTS03PDDrPDcQ7WrYN3\n3vGVfu3jo4/qOim7dfP/xLGaYqK3HXQQZLfQp1xTs5eKirXs2bMm0tT0UdTyGmpq6tcwWVkDm+yb\nyMzsk7TNfJ2Zc76ju7lEUVICsSZLyM2NnShOOw0ObzhjW5zUNNTJ7d69OtLkM48dO14FHDk5wxg4\ncDoFBefSs+cXIqNbJB67d8N77/mKvrbif+eduqN7Mz8scuxYuOgi/3PsWBgyJHFHcWlp2XTpcghd\nuhzSaJ9zjn37NtVLDHv2fERFxRq2bFlMZeWGBq/VtV5iqFseTk7O0KQc6dQZmEHPnv5xxBFNl9u7\nt66jO1bSeOUV+PRT39E9a1bbE0GzseqMoONxzrFz55v7R/rs3r0cgG7djtrf2du162gdBbbAOVi/\nvvFR/qpV9Y/yx4ypq+zHjoUjj/Tbk1V19R4qKtY2ShK1iaOmJvrqLiM7u7CJJDGCzMx8/R0lgZoa\n39Gdne07rttCZwSdQE3NPrZvf2l/5b93bwmQRq9eJzFgwB0UFJxLTs6QsMNMWnv2+AuUaiv7d97x\nj+gRIsOH+4r+wgvrKv2hQ5N7JEgs6em5dO16GF27Nh5v6ZyjsvKzmEliy5ZnI4MIol+rezN9E0NI\nS1PvdHtIS/ODAYKiRJDEqqvL2bJlEaWlcykre5qqqm2kpeWSl3cGw4b9J3l5XyErqyDsMJOKc/6U\nuuFR/ocf+qMqgK5dYfRo+MY36h/lt/VIqyMxM7Kz+5Od3Z+ePSc22l9dvZuKio8b9U3s3v0+ZWUL\ncW5vVOk0srMHNdnslJGR12nPJvwQ5iqcq6SmpjLyc2/UciXONVxvuYy/3qayyTIDBlxJfv6khH8e\nJYIkU1m5mbKyBZHO3sXU1FSQkZFHfv4UCgrOJS/vy6Sndwk7zKRQUQErVtQ/yl+2zE9mVmvoUF/R\nf/3rdZX+8OEd7yi/vaSnd6Fr1yPo2rVxo7ZzNVRWbozZgV1W9jT79n3e4LV61ksM0WcU2dmDSUuL\nfSGDr2T3xaxEm69U4ynT+sq5qfcK4sJGsyzS0rIwy478zIr6mU11dYxhSAmgRJAE9uz5eH+Tz/bt\n/wJqyM4eTP/+UykoOI+ePU8gLS11f1XO+c606Mp+2TL44IO6sdy5uf4o//zz6yr80aN9R50khlka\n2dkDyc4eCJzYaH9V1S4qKj5u1OxUXv4eZWULIpVnrfTI61jMCjyY+LPrVar1K9q69fT07mRkNF8m\nntdpbRk/bUk4Z1CpW7uEyDnHrl3LIpX/XMrL3wGga9cxDBlyIwUF59Kt27hOe1rdnL17/Xj8hkf5\npaV1ZQYP9hX9eefVVfojRmj64rBlZHSjW7fRdOs2utE+56rZu/fTekli795PgLT9czTVrxgbV5jx\nlGmq4vXzQqXe/1O8lAjaSU1NFdu3/yuqs3cdYPTseQIjRvw3BQXnkps7POww29VnnzU+yn//faiq\n8vtzcnzb/ZQp9Y/ye+sauA7HLJ2cnEHk5AyiV68vhh2ONKBEEKDq6t1s3bo4UvkvoKqqDLNs8vJO\nZ+jQm8nPP5usrIPCDjNwlZW+gm94lL8p6vZDhYW+oj/nnLpKf9QoHeWLtAclggTbt6+MsrJnKC2d\ny5Yti6ip2UNGRi/y88+OXNl7BhkZSTwI/QBt2tT4KH/lSti3z+/PzvYX13zlK/WP8vPzw41bJJUp\nESRARcUn+5t8tm17CagmK2sg/fp9l4KCc+nV64tNjpDoqPbt8521DY/yP/usrsyAAb6inzSprtI/\n+OBwpzQWkcb0L9kGzjnKy9/bX/nv2vUWAF26HM7gwTMoKDiX7t3Hd5rOqdLSxkf5K1bUze2ele
Uv\nez/jjLoKf8wYP7mWiCQ/JYI4OVfN9u2v7q/8Kyo+AowePY5n+PBfU1AwhS5dDg47zAO2Zw+88Qa8\n9BK8+qqv9D/9tG5/v36+oj/99LpK/5BDUm9ue5HORImgGdXVFWzb9nc2b55LWdl89u3bjFkWvXt/\nicGDbyA//xyys/uFHeYB2b4d/v1vX/G//LJPAvv2+QmzjjgCvvSluiP8sWP9zJki0rkoETSwb99W\ntmxZGJnD/1lqaspJT+9Bfv5XIlf2nklGRsedi2DTJl/h11b8y5b5qRcyMqCoCK65Bk46CSZO1DBN\nkVShRABUVJRQVvZUpLP3RZyrIiurP/36XRTp7D2lw06utW6dr/RrK/4P/F0pyc2F44+Hm27yFf+E\nCX4OHhFJPSmZCJxz7N69MuqG7UsAyM09mMLCH9Gnz3l0735Mwm403l6c8+P1ayv9l17y0yyDv+PR\nCSfAd7/rK/6jj9ZtDUXES5lE4FwNO3a8vr/y37PnQwC6d5/AsGG/jMzhf2jIUbZOVZVv2qmt+F9+\nuW4qhn79fIV//fX+55FHaqI1EYktZRLB2rW3sm7dbZhl0KvXqRQWXkNBweTIxFcdQ0UFLFlSV/G/\n8grsitz/fPhwOPtsOPFE/xg5MvH3PBWRzillEsFBB11Aly6HkJd3FpmZvcIOJy47d/oRPbXNPG+8\n4SdlA3+Ef/HFdRX/wI6Tz0QkyaRMImjqjk3JZPNm+Ne/6ir+pUv9iJ70dBg/HqZPrxvRoykZRCRR\nUiYRJKP16+t37K5c6bfn5MBxx8HPfuYr/uOOS+575IpIx6ZE0E6c87dLjK74163z+3r08CN6Lr7Y\nV/zjx/vJ2URE2oMSQUCqq/3cPNEjemqnXT7oIF/hX3edb98fM0bTLYtIeJQIEmTvXigurjvaf+UV\n2LHD7xs61E/IdtJJ/jFqlEb0iEjyUCJoo127/KRstRX/66/74Z3gZ+K88EJf6Z94IgwaFG6sIiLN\nUSKIU1lZ/RE9b73lm3/S0vxVutOm+Yr/hBM0/bKIdCxKBE3YsKGu0n/pJVi+3G/Pzvbz8syY4Sv+\n44+H7t3DjVVE5EAoEeBH9KxeXX9WzjVr/L7u3f24/W99y1f8RUV+eKeISGeRkomgpgbefbd+xV97\ni8WCAt+uf9VVvuIfM0a3VhSRzi1lqriVK2HBgroRPdu2+e2DBvmbr9R27B56qEb0iEhqSZlEsHAh\n3HCDr+i//vW6in/IkLAjExEJV8okgu98By66SLdaFBFpKNAZ6s3sTDP7wMxWm9mMGPt/Z2ZvRx4f\nmtm2oGLJy1MSEBGJJbAzAjNLB+4GTgdKgCVmNt85t6K2jHPu2qjyVwFHBRWPiIjEFuQZwbHAaufc\nGudcJfAYMKWZ8hcCfw4wHhERiSHIRDAQWB+1XhLZ1oiZDQGGAf9oYv9UMys2s+LNmzcnPFARkVSW\nLHexvQB4wjlXHWunc26Wc67IOVfUp0+fdg5NRKRzCzIRbACip1srjGyL5QLULCQiEoogE8ESYJSZ\nDTOzLHxlP79hITM7FOgNvBpgLCIi0oTAEoFzrgqYDiwCVgJ/cc4tN7PbzGxyVNELgMeccy6oWERE\npGmBXlDmnFsILGyw7eYG67cEGYOIiDQvWTqLRUQkJEoEIiIpTolARCTFKRGIiKQ4JQIRkRSnRCAi\nkuKUCEREUpwSgYhIilMiEBFJcUoEIiIpTolARCTFKRGIiKQ4JQIRkRSnRCAikuKUCEREUpwSgYhI\nilMiEBFJcUoEIiIpTolARCTFKRGIiKQ4JQIRkRSnRCAikuKUCEREUlxG2AGIdGjbt8PHH/tHSQk4\nB2lp/pGefuDLiXiNeJfT0sAs7G9UQqBEINKcykr45BNYs6auwq9dXrMGtmwJO8LEMmvfBKTE0zrX\nXAOTJyf8ZZUIJLU5B59/Xr9yj/5ZUgI1NXXls7Jg6FAYNg
yOOcb/HD7c/xw0yFdwNTVQXe1/Br3c\nXu8TVLzSOs4F8rJKBNL57dwZ+2i+dtuePfXLDxjgK/YvfrF+RT98uN+Xpq416VyUCKTj27cP1q+P\nXdGvWQOlpfXL9+jhK/ZDDoEzz6xf0Q8ZArm54XwOkZAoEUjycw42b266ol+/vn4zQ0ZGXfPN+ec3\nPqrv3Vtt0yJR4koEZvZ/wP3As865mpbKi7RaeXnzzTfl5fXL9+vnK/aJExtX9AMH+rZ6EYlLvGcE\n9wDfAe40s78CDzrnPgguLOl0qqp8x2tTR/WbNtUv362br9hHjIDTTqtf0Q8dCl26hPIxRDqjuBKB\nc+554Hkz6wlcGFleD9wLPOKc2xdgjNIROAdlZU1X9J984pNBrfR0GDzYV+yTJ9ev6IcNg4ICNd+I\ntJO4+wjMLB/4NnARsBT4E3ACcAlwchDBSZLZswfWrm16qOXOnfXL9+njK/Zjj4ULLmg81DJDXVQi\nySDePoK5wCHAw8A5zrmNkV2Pm1lxUMFJiLZtg9//Hj78sK6i37ixfpkuXeoq95NPbtx8061bGJGL\nSCvFe0h2p3PuhVg7nHNFCYxHkoFz8N3vwrx5dc03kyY1br456CA134h0AvEmgsPNbKlzbhuAmfUG\nLnTO3RNcaBKaxx6DuXPh17+GH/847GhEJGDxXiJ5RW0SAHDObQWuCCYkCdVnn8H06XDccXDddWFH\nIyLtIN5EkG5W1wZgZulAVktPMrMzzewDM1ttZjOaKPMNM1thZsvN7NE445EgOAff/z7s3g2zZ2ss\nvkiKiLdp6Dl8x/AfI+vfi2xrUiRZ3A2cDpQAS8xsvnNuRVSZUcBPgInOua1mdlBrP4Ak0KOPwlNP\nwW9/66dfEJGUEG8iuAFf+U+LrC8G7mvhOccCq51zawDM7DFgCrAiqswVwN2Rpiacc5savYq0j40b\n4aqr4Atf8FPdikjKiPeCshrgD5FHvAYC66PWS4AJDcocDGBmrwDpwC3OuUZnGmY2FZgKMHjw4FaE\nIHFxDr73PX+dwIMPqklIJMXEex3BKOCXwOFATu1259zwBLz/KPwFaYXAS2Y2OrpjOvI+s4BZAEVF\nRcFMyJ3KHnkEFiyA22+Hgw8OOxoRaWfxdhY/iD8bqAJOAR4CHmnhORuAQVHrhZFt0UqA+c65fc65\nj4EP8YlB2sunn8LVV8MJJ/ifIpJy4k0Euc65vwPmnFvnnLsF+EoLz1kCjDKzYWaWBVwAzG9QZh6R\n6SnMrADfVLQmzpjkQDkHU6fC3r1qEhJJYfF2Fu81szRglZlNxx/ZNzt/gHOuKlJ2Eb79/wHn3HIz\nuw0ods7Nj+z7spmtAKqBHzvnytr6YaSV5syBZ56BO+6AkSPDjkZEQmIujntgmtkxwEqgFzAT6AH8\nxjn3WrDhNVZUVOSKizW90QHbsAGOOALGjIEXX9TtF0U6OTN7s6kpgVo8I4hcD/BN59x/ALvw9yWQ\njsw5uOIKf4vHBx9UEhBJcS0mAudctZmd0B7BSDt58EF49lk/u+iIEWFHIyIhi7ePYKmZzQf+Cuy/\nZ6Bz7v8CiUqCs349XHutnzb6yivDjkZEkkC8iSAHKANOjdrmACWCjqS2Sai6Gu6/X01CIgLEf2Wx\n+gU6g/vvh0WL4O67/T0FRESI/8riB/FnAPU4576b8IgkGJ984qeVPuUUP8OoiEhEvE1DT0ct5wDn\nAZ8mPhwJhHNw+eX+5wMPqElIROqJt2noyeh1M/sz8K9AIpLEu/deWLwY/vAHfy9hEZEobT00HAXo\n3gEdwdq18KMfwZe+5GcYFRFpIN4+gp3U7yP4DH+PAklmtU1C4DuKdaN5EYkh3qah7kEHIgH44x/h\n73/3P4cMCTsaEUlScTUNmdl5ZtYzar2XmZ0bXFhywD7+GP7jP+D00/21AyIiTYi3j+DnzrnttSuR\nG8f8PJiQ5IDV1MBll/
nRQffdpyYhEWlWvMNHYyWMeJ8r7e0Pf4AXXvCjhXRrTxFpQbxnBMVmdruZ\njYg8bgfeDDIwaaM1a+D66+GMM/xZgYhIC+JNBFcBlcDjwGNABfCDoIKSNqqpge9+FzIy1CQkInGL\nd9RQOTAj4FjkQN19N/zzn36oaGFh2NGISAcR76ihxWbWK2q9t5ktCi4sabWPPoIZM2DSJPiO5ggU\nkfjF2zRUEBkpBIBzbiu6sjh51NT4yj8z03cQq0lIRFoh3kRQY2b7h5+Y2VBizEYqIfn97+Hll+F/\n/gcGDgw7GhHpYOIdAvoz4F9m9k/AgBOBqYFFJfFbtQp+8hP4ylfg4ovDjkZEOqB4O4ufM7MifOW/\nFJgH7AkyMIlDdbVvEsrOhlmz1CQkIm0S76RzlwM/BAqBt4HjgFepf+tKaW933gmvvAIPPQQDBoQd\njYh0UPH2EfwQOAZY55w7BTgK2Nb8UyRQH34IP/0pnHMOfPvbYUcjIh1YvImgwjlXAWBm2c6594FD\nggtLmlVdDZdeCrm5fmZRNQmJyAGIt7O4JHIdwTxgsZltBdYFF5Y064474NVX4ZFHoH//sKMRkQ4u\n3s7i8yKLt5jZC0BP4LnAopKmvf8+/OxnMGUKfOtbYUcjIp1Aq2cQdc79M4hAJA61o4S6doX//V81\nCYlIQmgq6Y7k9tvhtdfg0UehX7+woxGRTqKtN6+X9rZyJdx0E5x3HlxwQdjRiEgnokTQEVRV+VFC\n3br5m86oSUhEEkhNQx3Bb38Lb7wBjz0GffuGHY2IdDI6I0h2y5fDz38OX/safOMbYUcjIp2QEkEy\nq20S6tHD33RGTUIiEgA1DSWzX/8aiovhL3+Bg3T7BxEJhs4IktW778Itt/jmoK9/PexoRKQTUyJI\nRvv2+SahXr3grrvCjkZEOrlAE4GZnWlmH5jZajObEWP/pWa22czejjwuDzKeDuNXv4K33vJDRfv0\nCTsaEenkAusjMLN04G7gdKAEWGJm851zKxoUfdw5Nz2oODqcd96B227zF42df37Y0YhICgjyjOBY\nYLVzbo1zrhJ4DJgS4Pt1fLVNQr17+/sQi4i0gyATwUBgfdR6SWRbQ+eb2Ttm9oSZDYr1QmY21cyK\nzax48+bNQcSaHH75S1i61E8oV1AQdjQikiLC7ixeAAx1zo0BFgNzYhVyzs1yzhU554r6dNY287ff\nhpkz/dTS553XcnkRkQQJMhFsAKKP8Asj2/ZzzpU55/ZGVu8DxgcYT/KqrPRNQgUF/j7EIiLtKMhE\nsAQYZWbDzCwLuACYH13AzKJvrzUZWBlgPMnrF7+AZcv8bSfz88OORkRSTGCjhpxzVWY2HVgEpAMP\nOOeWm9ltQLFzbj5wtZlNBqqALcClQcWTtJYuhf/6L38D+smTw45GRFKQOefCjqFVioqKXHFxcdhh\nJEZlJRQVQWkpvPce5OWFHZGIdFJm9qZzrijWPs01FKaZM/1UEgsWKAmISGjCHjWUut580w8Xvfhi\nOPvssKMRkRSmRBCGvXv9KKG+feGOO8KORkRSnJqGwnDbbb5P4Jln/FXEIiIh0hlBe1uyxE8q953v\nwFlnhR2NiIgSQbuqqPBNQv36we23hx2NiAigpqH2deutsGIFLFzo7zUgIpIEdEbQXt54w9968rLL\nYNKksKMREdlPiaA9VFTAJZfAgAHw3/8ddjQiIvWoaag9/Pzn8P77sGgR9OwZdjQiIvXojCBor70G\nv/0tXHEFfPnLYUcjItKIEkGQ9uzxo4QKC30yEBFJQmoaCtLNN8MHH8DixdCjR9jRiIjEpEQQlH//\n23cMf+97cNppYUcjklD79u2jpKSEioqKsEORBnJycigsLCQzMzPu5ygRBGH3bt8kNHgw/OY3YUcj\nknAlJSV0796doUOHYmZhhyMRzjnKysooKSlh2LBhcT9PfQRBuPFGWLUKHngAuncPOxqR
hKuoqCA/\nP19JIMmYGfn5+a0+U1MiSLR//cvPKDptGpx6atjRiARGSSA5teX3okSQSLt3+8nkhgzxVxGLiHQA\nSgSJ9NOfwurV8OCD0K1b2NGISJL5+OOPmTBhAiNHjuSb3/wmlZWVjcqUlZVxyimn0K1bN6ZPn94u\ncSkRJMrLL8Odd8IPfgAnnxx2NCKShG644QauvfZaVq9eTe/evbn//vsblcnJyWHmzJn8th2vPdKo\noUQoL/dNQsOGwf/7f2FHI9KurrkG3n47sa85blzLN++bOXMmjzzyCH369GHQoEGMHz+enj17MmvW\nLCorKxk5ciQPP/wwXbp04dJLLyU3N5elS5eyadMmHnjgAR566CFeffVVJkyYwOzZswHo1q0b06ZN\nY+HChfTv359f/OIXXH/99XzyySfccccdTJ48mbVr13LRRRdRXl4OwF133cUXvvCFFj+Tc45//OMf\nPProowBccskl3HLLLUybNq1eua5du3LCCSewevXq1n9xbaQzgkT4yU/go4/8KCE1CYkEbsmSJTz5\n5JMsW7aMZ599luLiYgC++tWvsmTJEpYtW8Zhhx1W74h769atvPrqq/zud79j8uTJXHvttSxfvpx3\n332XtyOZrLy8nFNPPZXly5fTvXt3brzxRhYvXszcuXO5+eabATjooINYvHgxb731Fo8//jhXX301\nADt37mTcuHExHytWrKCsrIxevXqRkeGPvwsLC9mwYUN7fm1N0hnBgfrnP+H3v4err4YvfjHsaETa\nXRi33X7llVeYMmUKOTk55OTkcM455wDw3nvvceONN7Jt2zZ27drFGWecsf8555xzDmbG6NGj6du3\nL6NHjwbgiCOOYO3atYwbN46srCzOPPNMAEaPHk12djaZmZmMHj2atWvXAv5iuunTp/P222+Tnp7O\nhx9+CED37t33J5RYSktLg/gqEkKJ4EDs2uWbhEaMgF/8IuxoRFLepZdeyrx58xg7diyzZ8/mxRdf\n3L8vOzsbgLS0tP3LtetVVVUAZGZm7h9+GV0uuszvfvc7+vbty7Jly6ipqSEnJwfwZwQnnnhizLge\nffRRDjvsMLZt20ZVVRUZGRmUlJQwcODAxH4BbaSmoQMxYwasXetHCXXtGnY0Iilj4sSJLFiwgIqK\nCnbt2sXTTz8N+Mq4f//+7Nu3jz/96U+BvPf27dvp378/aWlpPPzww1RXVwN1ZwSxHocffjhmximn\nnMITTzwBwJw5c5gyZUogMbaWEkFbvfAC3H03/PCH0MRRgIgE45hjjmHy5MmMGTOGSZMmMXr0aHr2\n7MnMmTOZMGECEydO5NBDDw3kva+88krmzJnD2LFjef/99+naioPAX/3qV9x+++2MHDmSsrIyLrvs\nMgDmz5+/vw8CYOjQoVx33QH24kEAAA2ASURBVHXMnj2bwsJCVqxYkfDPEc2cc4G+QaIVFRW52o6h\n0OzaBaNHQ2amHy7RpUu48Yi0s5UrV3LYYYeFGsOuXbvo1q0bu3fv5qSTTmLWrFkcffTRocaULGL9\nfszsTedcUazy6iNoi+uvh3Xr/LUDSgIioZg6dSorVqygoqKCSy65REngACgRtNbf/w5/+ANcdx1M\nnBh2NCIpq3Y8vhw49RG0xs6dcNllcPDB8J//GXY0IiIJoTOC1vjxj2H9ej/DaG5u2NGIiCSEzgji\ntXgx/PGPvkno+OPDjkZEJGGUCOKxY4dvEjr0ULjttrCjERFJKCWCePzoR7BhA8yerSYhEWmzu+66\ni5EjR2JmzU45MWfOHEaNGsWoUaOYM2dO4HGpj6AlixbBfff5IaMTJoQdjYh0YBMnTuTss8/m5Gam\nqt+yZQu33norxcXFmBnjx49n8uTJ9O7dO7C4lAias307XH45HHYY3Hpr2NGIJKVVq65h167EzkPd\nrds4Ro1qfja7jjYNNcBRRx3VYplFixZx+umnk5eX
B8Dpp5/Oc889x4UXXhjXe7SFmoaac9118Omn\nvkkoMrGUiISvI05DHa8NGzYwaNCg/evtMV11oGcEZnYm8D9AOnCfcy7mXVvM7HzgCeAY51zI80dE\nPPusv7/AjBlw7LFhRyOStFo6cg9CR5yGOpkFlgjMLB24GzgdKAGWmNl859yKBuW6Az8EXg8qllbb\nts03CR1xBNxyS9jRiEicknka6sMPPzyuzzBw4MB6cZeUlDTbp5AIQTYNHQusds6tcc5VAo8BseZc\nnQn8CqgIMJbWufZa+Pxz3yQU9QcjIsmhI05DHa8zzjiDv/3tb2zdupWtW7fyt7/9rd6ZTRCCTAQD\ngfVR6yWRbfuZ2dHAIOfcMwHG0TrPPOMTwIwZUBRzoj4RCVlHnYb6zjvvpLCwkJKSEsaMGcPll18O\nQHFx8f7lvLw8brrpJo455hiOOeYYbr755v0dx0EJbBpqM/sacKZz7vLI+kXABOfc9Mh6GvAP4FLn\n3FozexH4j1h9BGY2FZgKMHjw4PHr1q0LJGa2bvXNQfn5UFysswGRJmga6uSWTNNQbwAGRa0XRrbV\n6g4cCbwYaZPrB8w3s8kNk4FzbhYwC/z9CAKL+JprYNMmWLBASUAkyWka6sQJMhEsAUaZ2TB8ArgA\n+FbtTufcdqCgdr25M4J2sWABPPQQ3HQTjB8fSggiEj9NQ504gfUROOeqgOnAImAl8Bfn3HIzu83M\nJgf1vm2yZQtMnQpjxsCNN4YdjYhIuwr0OgLn3EJgYYNtNzdR9uQgY2nWD38IpaWwcCFkZYUWhohI\nGHRl8VNPwSOPwM9+BnFc/i0i0tmkdiIoK4PvfQ/GjoWf/jTsaEREQpHaieCqq3wymDNHTUIiErim\npqF2znH11VczcuRIxowZw1tvvRXz+W+++SajR49m5MiRXH311SRq+H/qJoK5c+HPf/ajhMaODTsa\nEUkBEydO5Pnnn2fIkCH1tj/77LOsWrWKVatWMWvWLKZNmxbz+dOmTePee+/dX/a5555LSFypOQ11\naSl8//u+T+AnPwk7GpGO7ZprINGTrY0bB3ekzjTUTz31FBdffDFmxnHHHce2bdvYuHEj/fv3319m\n48aN7Nixg+OOOw6Aiy++mHnz5jFp0qS43rs5qXlGMH26v4p4zhzIzAw7GhFppc42DXU8U09v2LCB\nwsLCZsu0VeqdETzxBDz+OMycCZFpaEXkALRw5B4ETUOdWKmVCDZvhiuv9FcO33BD2NGISIJ11Gmo\nBw4cyPr1dXN0lpSUMHDgwEZlSkpKmi3TVqnVNPSDH/jbT86erSYhkQ6ss01DPXnyZB566CGcc7z2\n2mv07NmzXv8AQP/+/enRowevvfYazjkeeughpkyJNbN/66VOIvjLX+Cvf4Wf/xyOPDLsaETkAHS2\naajPOusshg8fzsiRI7niiiu455579j9n3Lhx+5fvueceLr/8ckaOHMmIESMS0lEMAU5DHZSioiJX\n2zHUKn/7G9xzj+8jyEitFjGRRNM01MktmaahTi5f/rJ/iEinoGmoEyd1EoGIdCqahjpxUqePQEQS\nqqM1K6eKtvxelAhEpNVycnIoKytTMkgyzjnKysr2D2mNl5qGRKTVake+bN68OexQpIGcnJx6VyDH\nQ4lARFotMzOTYcOGhR2GJIiahkREUpwSgYhIilMiEBFJcR3uymIz2wysa+PTC4DSFku1P8XVOoqr\n9ZI1NsXVOgcS1xDnXJ9YOzpcIjgQZlbc1CXWYVJcraO4Wi9ZY1NcrRNUXGoaEhFJcUoEIiIpLtUS\nwaywA2iC4modxdV6yRqb4mqdQOJKqT4CERFpLNXOCEREpAElAhGRFNepE4GZfd3MlptZjZk1OeTK\nzM40sw/MbLWZzWiHuPLMbLGZrYr87N1EuWozezvymB9gPM1+fjPLNrPHI/tfN7OhQcXSyrguNbPN\nUd/R5e0U1wNm
tsnM3mtiv5nZnZG43zGzdrljShxxnWxm26O+r5vbIaZBZvaCma2I/C/+MEaZdv++\n4oyr3b+vyPvmmNkbZrYsEtutMcok9n/SOddpH8BhwCHAi0BRE2XSgY+A4UAWsAw4POC4fg3MiCzP\nAH7VRLld7fAdtfj5gSuB/40sXwA8niRxXQrcFcLf1UnA0cB7Tew/C3gWMOA44PUkietk4Ol2/q76\nA0dHlrsDH8b4Pbb79xVnXO3+fUXe14BukeVM4HXguAZlEvo/2anPCJxzK51zH7RQ7FhgtXNujXOu\nEngMmBJwaFOAOZHlOcC5Ab9fc+L5/NHxPgF8ycwsCeIKhXPuJWBLM0WmAA857zWgl5n1T4K42p1z\nbqNz7q3I8k5gJTCwQbF2/77ijCsUke9hV2Q1M/JoOKonof+TnToRxGkgsD5qvYTg/yD6Ouc2RpY/\nA/o2US7HzIrN7DUzCypZxPP595dxzlUB24H8gOJpTVwA50eaE54ws0EBxxSvMP6m4nV8pMnhWTM7\noj3fONJ8cRT+CDdaqN9XM3FBSN+XmaWb2dvAJmCxc67J7ywR/5Md/n4EZvY80C/Grp85555q73hq\nNRdX9IpzzplZU2N4hzjnNpjZcOAfZvauc+6jRMfagS0A/uyc22tm38MfIZ0ackzJ7C3839QuMzsL\nmAeMao83NrNuwJPANc65He3xnvFoIa7Qvi/nXDUwzsx6AXPN7EjnXMy+n0To8InAOXfaAb7EBiD6\nSLIwsu2ANBeXmX1uZv2dcxsjp8CbmniNDZGfa8zsRfxRS6ITQTyfv7ZMiZllAD2BsgTH0eq4nHPR\nMdyH73tJBoH8TR2o6IrOObfQzO4xswLnXKCTq5lZJr6y/ZNz7v9iFAnl+2oprrC+rwYxbDOzF4Az\ngehEkND/STUNwRJglJkNM7MsfMdLYCN0IuYDl0SWLwEanbmYWW8zy44sFwATgRUBxBLP54+O92vA\nP1yklypALcbVoB15Mr6dNxnMBy6OjIY5Dtge1RQYGjPrV9uObGbH4v//A03okfe7H1jpnLu9iWLt\n/n3FE1cY31fkvfpEzgQws1zgdOD9BsUS+z/Z3j3i7fkAzsO3N+4FPgcWRbYPABZGlTsLP2rgI3yT\nUtBx5QN/B1YBzwN5ke1FwH2R5S8A7+JHy7wLXBZgPI0+P3AbMDmynAP8FVgNvAEMb6ffX0tx/RJY\nHvmOXgAObae4/gxsBPZF/r4uA74PfD+y34C7I3G/SxMj1kKIa3rU9/Ua8IV2iOkEfEfnO8DbkcdZ\nYX9fccbV7t9X5H3HAEsjsb0H3BzZHtj/pKaYEBFJcWoaEhFJcUoEIiIpTolARCTFKRGIiKQ4JQIR\nkRSnRCApy8x2tVyqyedOj8z86CLXedRub3ImTTPrb2ZPR60fa2YvmZ9hdamZ3WdmXczsbDO7re2f\nTKR1lAhE2uYV4DRgXYPtk/DTEIwCpgJ/iNp3HXAvgJn1xY8Dv8E5d4hz7ijgOfxMmM8A55hZl0A/\ngUiEEoGkvMhR/G/M7D0ze9fMvhnZnhaZVuB98/eNWGhmXwNwzi11zq2N8XLNzaR5Pr6yB/gBMMc5\n92rtE51zTzjnPnf+4p4XgbMD+cAiDSgRiMBXgXHAWPxR/m8ilfdXgaHA4cBFwPFxvFbMmTTNbBiw\n1Tm3N7L9SODNZl6nGDixFZ9BpM2UCET8dAN/ds5VO+c+B/4JHBPZ/lfnXI1z7jP8NBZt1R/Y3Iry\nm/BToYgETolAJLGamklzD35+mFrLgfHNvE5O5DkigVMiEIGXgW9GbgbSB3/LxzfwHcLnR/oK+uJv\nXdiSpmbS/BDfzFTrLuASM5tQu8HMvhp5H4CDqT/tsEhglAhEYC5+psdlwD+A6yNNQU/i2/hXAI/g\nb1SyHcDMrjazEvwR/ztmdl/ktRYCa/CzQt6Lv7cszrly4CMzGxlZ/xw/tfZvI8
NHVwJnADsjr3MK\nfvSQSOA0+6hIM8ysm/N3qMrHnyVMjCSJtrzWecB459yNLZTrCzzqnPtSW95HpLU6/B3KRAL2dOQm\nIVnAzLYmAQDn3NxIQmnJYOBHbX0fkdbSGYGISIpTH4GISIpTIhARSXFKBCIiKU6JQEQkxSkRiIik\nuP8PPVJ+jK+JNb4AAAAASUVORK5CYII=\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "iTd7Su-5YMdU",
        "colab_type": "text"
      },
      "source": [
        "### 9.4 Apply TF-IDF to the raw data, reduce dimensionality with PCA, apply min-max scaling, then train an RBF-kernel SVM"
      ]
    },
    {
      "cell_type": "code",
      "metadata": {
        "id": "grhrZr2H1Y7z",
        "colab_type": "code",
        "outputId": "71db62b6-7197-47bf-966f-8b32281028ac",
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 1000
        }
      },
      "source": [
        "train_val_data_tfidf_pca = TrainValData(tfidf_pca_data_path)  # Load the TF-IDF + PCA data and split it into training and validation sets\n",
        "search_tfidf_pca = SVMSearch(train_val_data_tfidf_pca)  # Create the hyperparameter search object\n",
        "search_tfidf_pca.search()  # Search over the hyperparameters C and gamma\n",
        "search_tfidf_pca.draw()  # Plot the accuracy curves for each hyperparameter combination\n",
        "search_tfidf_pca.get_best_params()  # Find the best parameter combination\n",
        "search_tfidf_pca.save_best_model(\"./model/Otto_tfidf_pca_rbf_svc.pkl\")  # Save the model"
      ],
      "execution_count": 32,
      "outputs": [
        {
          "output_type": "stream",
          "text": [
            "start trying: C=0.1, gamma=0.1\n",
            "accuracy=0.707\n",
            "\n",
            "start trying: C=0.1, gamma=1.0\n",
            "accuracy=0.7385\n",
            "\n",
            "start trying: C=0.1, gamma=10.0\n",
            "accuracy=0.518\n",
            "\n",
            "start trying: C=1.0, gamma=0.1\n",
            "accuracy=0.7465\n",
            "\n",
            "start trying: C=1.0, gamma=1.0\n",
            "accuracy=0.7805\n",
            "\n",
            "start trying: C=1.0, gamma=10.0\n",
            "accuracy=0.723\n",
            "\n",
            "start trying: C=10.0, gamma=0.1\n",
            "accuracy=0.767\n",
            "\n",
            "start trying: C=10.0, gamma=1.0\n",
            "accuracy=0.7845\n",
            "\n",
            "start trying: C=10.0, gamma=10.0\n",
            "accuracy=0.728\n",
            "\n",
            "start trying: C=100.0, gamma=0.1\n",
            "accuracy=0.785\n",
            "\n",
            "start trying: C=100.0, gamma=1.0\n",
            "accuracy=0.766\n",
            "\n",
            "start trying: C=100.0, gamma=10.0\n",
            "accuracy=0.7285\n",
            "\n",
            "start trying: C=1000.0, gamma=0.1\n",
            "accuracy=0.773\n",
            "\n",
            "start trying: C=1000.0, gamma=1.0\n",
            "accuracy=0.757\n",
            "\n",
            "start trying: C=1000.0, gamma=10.0\n",
            "accuracy=0.728\n",
            "\n",
            "best_C_=100.0, best_gamma_=0.1, best_accuracy_=0.785\n",
            "start fitting with best params...\n",
            "end fitting with best params.\n"
          ],
          "name": "stdout"
        },
        {
          "output_type": "display_data",
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAYgAAAEGCAYAAAB/+QKOAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAADh0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uMy4xLjEsIGh0\ndHA6Ly9tYXRwbG90bGliLm9yZy8QZhcZAAAgAElEQVR4nO3deXxU9bn48c+TkJBAIhA2k7AbZJGw\nSFxxbxG0FerSurQKdau0Xtve+7u37etatXrb2+X+autVQQQEVMS6IVo3FG1/tdoSZEcFRJQssoRF\nsiczz++PcyaZJJNkEubMTDLP+/Wa18w58z0zzwzkPPOc7/d8j6gqxhhjTHNJsQ7AGGNMfLIEYYwx\nJiRLEMYYY0KyBGGMMSYkSxDGGGNC6hHrACJlwIABOmLEiFiHYYwxXcr69esPqurAUM91mwQxYsQI\nCgsLYx2GMcZ0KSLyWWvP2SEmY4wxIVmCMMYYE5IlCGOMMSFZgjDGGBOSJQhjjDEhWYIwxhgTkiUI\nY4wxIXWb8yCMiSeqPny+Kvz+Sny+SlJTB5Kc3DvWYcUtVdi3D7Zsga1bneWhQ2HYMOc2eDAk2c/Z\nqLMEYRKKquL3VzfsuL26V61t9s5CevpJ9O49iYyMifTuPZGMjImkpY1AJLH2fMeOwbZtTjIIvpWV\ntb5NSkrThBG4Ba/LyIjeZ0gUliBMXFBVVOs83Wn7/ZX4/VWdii8pKZ2kpF4kJ/dqdp9JSsrgEOuD\n79OoqSmivHwzFRWbOXjwecC5UFdycga9e+c3JAznPp8ePfpE8NuNjbo62LGjaRLYuhU+/bSxTe/e\nMGECXH455Oc7twkTnITw+eewd69zH3x75x0oLgafr+n79evXMoEEJ5HsbOhhe7wOke5yRbmCggK1\nqTaio6ammJqakgjuuKvw+SoBX7vv3ZxIajs759bvk5LSw2ybFtFf+T5fBRUV2xoSRnn5JioqNlNf\nf6ShTc+ew8nImEhGxqSG5JGenodIcsTiiBRVZ0fePBF8+KGTJACSk2HMmMYkEEgEI0Z07tBRfT2U\nljYmjVCJ5PDhptskJ0NubttVSJ8+IHLcX0mXIiLrVbUg5HOWIEw4/P56yspepqRkPocPvxHGFkkk\nJ/fuxI473J12L/dXfff4Saiq1NQUuQljc8N9ZeXHBBJnUlIavXtPaFZtTCQlpX/U4jx8uOWhoa1b\n4csvG9sMHdoyEYwdCz17Ri1MwDmU1TxxBC/v3duYwAIyM9tOILm5kJoa3c/hNUsQptNqaoopLV1E\nScmj1NYWk5qaS07OLWRkTG1zBy6SgiTaTzEP+HzVVFZ+2CxxbKKu7kBDm9TUnKCE4VQcvXqNISkp\npdPvW13tVADNE0FxcWObvn2bJoHAfd++x/OJo8fvdzrGm1cewYnkwIGm24g4h6raSiL9+3etKsQS\nhOkQVT+HD79FScl8Dh5cDfjJyppBTs5tZGV9rdv8au/Kamv3UV6+qVm1sR1V5yexSAq9eo1vUmn0\n7j2R1NTBTRK33w+7dzdNAlu2wM6djcf4U1Nh/PjGJBC45eZ2rR1hZ1RWQlFR20mkurrpNunpbSeQ\noUMhLS02nycUSxAmLHV1ZZSWPkZp6SNUVe0iJWUAJ554Izk53yM9fVSswzPt8PvrqKz8uMVhqtra\nxp/9qgMpL5/I3r0T2bJlIu++O5GPPhpPXV0aIjBqVMtEMHq0de62RhUOHgydQAK3L75oud2gQW0n\nkUGDojes1xKEaZWq8uWX71FSMp/9+59BtYY+fc4hJ2ceAwdeSVJSlA8cm+NWXt44jHTrVti1q4zy\n8i3077+ZUaMCt6307OmM6FJNJilpDP36TaRPn8Zqo2fPIXaYMAJqapxDc20lkYqKptukprY/rLd3\nhE6rsQRhWqivP8a+fU9QUjKf
iootJCdnMnjwDeTk3EZGxoRYh2fCUF/fOIw0cGhoyxbnkFFAr15w\nyilNK4L8fBg40EdV1a4mlUZFxWaqq/c0bNujR98WHeK9e0+wE/4iTNXp/A81EitwKylxDgcGy8pq\nTBYFBfDzn3fu/dtKEFY4Jpjy8k0UF89n//4n8fnKyciYwsknL2TQoGvp0cPONIpHqs5x8OaJ4MMP\nodY9Hy8pCU4+GaZOhblzGxPByJGtHapIplevMfTqNQb4ZsPa+vqjVFRsbZI4vvhiKT5fudtCSE/P\na5E4EvGEv0gRcXb2WVkwaVLoNnV1TpIINaQ3+LySiMdmFUT35/NVceDAM5SUzOfLL98nKSmNQYOu\nISdnHpmZp9lhhDhy5EjLRLB1q7M+IDe3ZUUwdqx3HZ+qfqqr97SoNqqqdpEIJ/x1d3aIKUFVVu6k\npGQBX3yxlPr6Q6SnjyEn5zZOPPEGUlKyYh1eQqupcSqA4ESwZYtTKQSccELL8wkmTHB+acaDlif8\nbaaiYlOTE/7S0ka0qDbi9YS/RGWHmBKI319HWdlqiovnc+TIW4j0YMCAy8nJmUffvhdYtRBlqrBn\nD2ze3DQR7NjROIw0JQXGjYPzz296TsHQofE9jDQ5uTcnnHA6J5xwesM654S/4hbnbZSV/Zl4OuHP\nhMcqiG6iunovpaWPUlq6iNraUnr2HEZOzq2ceOKN9OyZHevwEoKqczx4/frG2wcfwKFDjW1Gjmx5\nctnJJztJojsL74S/3IaEkZ5+Eqmpg0hJGURq6mBSUgZZH5lHrILoplT9HDr0BiUl8ykrexlQsrIu\nISfnEfr3v9TKeA+pOqOFmieDwPw/KSlOArjiCqfjePJkZzRRZmZs446V5OQ0MjOnkJk5pcl654S/\npn0bhw+/FWI2XEhK6uUmjcFNkkfwukAySUnJsv//EWAJoguqrT3AF18soaTkEaqrPyUlZRDDhv2E\n7OxbSE8fGevwup3myaCw0EkGgY7jlBSnErjqKicZTJ3qLEd77qGuKDV1MFlZ08nKmt6wzu+vo7b2\nC+rq9lNbu4/a2v0NjwP31dWfc+xYIbW1+wk9yWMSKSkDmySNUIkkcJ+cHEenNscRSxBdhKpy9Ojf\nKCmZz4EDz6FaS58+5zNy5K8YOPAKkpK62QxiMaIKn3zSsjIITgYTJ8K3vtWYDCZMsGQQSUlJKaSl\nDSUtbWi7bVX91NcfbjWROOv2UVW1m7q6/UHDdZtKTs5skTQaq5Om63r06JswfXmWIOJcff1Rvvji\ncUpKFlBZuY3k5D7k5NxGTs736N17fKzD69L8/tDJ4OhR5/nUVKcSsGQQv0SSSEnpT0pK/7D+Hny+\nyoakEZxQgtdVVe3k6NG/UVd3kMAw3qbvmeJWJ80TSahDX4OOa9LEWLMEEaeOHfuAkpL57Nu3Ar+/\nkszMAsaMWcSgQdfYmayd0DwZFBbChg1Nk8HEiXDNNU2TQXeb2jnRJSf3Ij19BOnpI9ptq+qjru4g\ntbXBVUljIgmsq6z8iLq6ffj91SFfp0ePfu0c5mpcl5ycGVfViSWIOOLzVbJ//9OUlCzg2LF/kpSU\nzqBB15GTcxsnnBBykIEJwe+HXbtaVgaBaxb07Okkg2uvbUwGp5xiycA0JZLsVgKDgfw226oqPl95\nm4mktnY/FRVbqa1dS339oZCvk5SU1uyQVsv+ksbqZIDnHfGWIOJARcVHlJQsYN++ZdTXH6FXr3Hk\n5f2RwYNvICWli0yuHyN+vzM1dXAy2LChaTKYNAm+/e2myaC7Dys10SUi9OiRSY8emUBeu+39/lq3\nOmlMHsF9J859KeXlG6mr298wjXuzdyUlZQApKYM44YQzGTt2UcQ/lyWIGPH7azl4cBUlJQs4cuRt\nRFIYMOAKcnPn0afPeXFVZsaL5skgcJjo2DHneUsGpqtISkqlZ88cevbMabetqlJff6SNRLLfs8
PO\nliCirLr6M0pKFlJaupi6un307DmckSN/RXb2jW4pa8BJBjt2tKwMAskgLc1JBtdf35gMxo+3ZGC6\nHxEhJaUfKSn93MkVo8fTBCEiM4E/AsnAIlX9dbPn7wcudBd7AYNUta/7nA/Y4j73uarO8jJWL6n6\nOHToNUpKFlBW9gqg9O//NXJy5pGVNSPhT+jx+UIng3J3RGJamnOi2Q03NCaDceMsGRjjNc8ShDh7\nvYeA6UARsE5EVqvq9kAbVf1xUPt/AYJPs6xS1clexRcNtbX7KC11TmirqfmMlJTBDBv2M3JybiEt\nbXisw4uJ5smgsBA2bmxMBunpTmUwZ44zx30gGdgVzYyJPi//7E4HdqnqbgARWQnMBra30v5a4G4P\n44kKVeXIkb9QUrKAgwefR7WOvn0v5KSTfseAAd/o0mOiO8rng48/blkZBK6elZ7uVAZz5zatDCwZ\nGBMfvPxTzAX2Bi0XAWeEaigiw4GRwNqg1WkiUgjUA79W1VUhtrsVuBVg2LBhEQq7c+rqjrBv33L3\nhLYP6dGjL7m5PyA7+3v07j02prFFg88HH33UNBls3NiYDHr1cpLBjTc2JoOxYy0ZGBPP4uXP8xrg\nWVUNnlRluKoWi8goYK2IbFHVT4I3UtWFwEJwZnONXriNvvxyHSUlC9i//yn8/ioyM09nzJjHGDTo\napKT02MRkueaJ4PAYaLKSuf5Xr1gyhS46aamySA5sbtajOlyvEwQxUDwZCpD3HWhXAP8IHiFqha7\n97tF5B2c/olPWm4afT5fBfv3r6S4eD7l5etJSurF4MHfISfnNjIzT411eJ7ZsAGWLIEnn2yctTSQ\nDG6+ubHPYMwYSwbGdAdeJoh1wGgRGYmTGK4BrmveSETGAv2A94LW9QMqVbVGRAYA04DfehhrWCoq\ntrtXaFuOz3eUXr1OYfToBxk8+Dvd9tKKhw7BihWweLFTJfTs6UxhfckllgyM6e48SxCqWi8itwOv\n4wxzXaKq20TkXqBQVVe7Ta8BVmrTKxeNAx4RET+QhNMH0Vrntqf8/hoOHHiBkpL5HD36V0RSGTjw\nKnJy5tGnz7RueUKb3w9vveUkhRdegNpaOPVUePBBuO466Ncv1hEaY6LBrijXiqqqTyktDZzQdoC0\ntFHk5HyPE0/8LqmpAyP2PvFkzx547DFYuhQ+/9xJBN/5jtOxPLlLDzg2xrTGrigXJlUfZWWvUFIy\nn0OHXgOE/v0vIzd3Hv36TUckKdYhRlxVFaxa5VQLb73lXAN5+nT47W9h9mznJDVjTGKyBAHU1JRS\nWrqY0tKF1NTsJTU1m+HDf0529s1hXbSkq1F1OpwXL3b6F44cgREj4Be/cM5JiPGIYWNMnEj4BFFZ\nuYt168ahWk+/fl8lL+8P9O9/Wbc8oa2srLHDedMmp8P5yiud4agXXABJ3a9AMsYch4RPEOnpJzFy\n5K8YMGA2vXqdHOtwIs7na+xwXrXK6XCeOhUefti5OI51OBtjWpPwCUJEGDbs32MdRsR9+qnT2fzY\nY7B3L2RlwW23OR3OkybFOjpjTFeQ8AmiO6mqcoalLl4Ma9c6Hc4XXwz/9//CrFl2LWVjTMdYguji\nVJ3LaS5Z0tjhPHIk3Huv0+E8tPv1sRtjosQSRBdVVuZMebF4MWze7AxHDXQ4n3++dTgbY46fJYgu\nxOeDN990ksKLLzodzgUFTofztddCX7t8tTEmgixBdAG7dzee4VxUBP37w7x5TofzxImxjs408Pud\nKW2PHXOugBS4Ly93srlq483vt+W2lsHpRBNxyuHA49bWdXQ5Xl4jUq+ZlQXnnRfx/9KWIOJUVRU8\n/7xTLbz9tvP/YMYM+P3vrcM5IlSdnXbwTrz5jj3Ujr6tNoGLX3RV8bJTDcxv1lYCiUQS6sg28e6M\nM+D99yP+spYg4oiqc32FxYvhqafg6FGnw/m++5xLcCZ0h7
PP5+yAO7rTbqtNfX14752UBJmZkJHR\n9H7IkMbl5s8F32dkQGpqfP+6Ne2LdNKJ5Guke3PtGUsQceDgwcYO5y1bnA7nq65yDiF1yQ5nVaiu\njsyv8kCbwNWIwpGe3nInnZXlzCESzg69eZu0NNuJGuf/QILNbW8JIkZ8PlizprHDua4OTjsN5s93\nznCO2w5nVXjgAfjnP9veoft87b8WOH9wmZktd84DBoS/Iw/eoWdkJNwfsTFesQQRZaE6nH/wA6da\nyM+PdXRhePtt+NGPnONd/fs7O+SBA51jYR3ZkQce9+xpv86NiVOWIKKgsrKxw/mdd5xDRjNmwP33\nw2WXdaEOZ1W4+27IzYUdO2wucGO6OUsQHlGFwsLGM5y//BJGjYL/+i+nw3nIkFhH2Alr18Lf/uZc\nWs6SgzHdniWICDtwAJ54wkkMW7c6/aWBDufzzuuCHc4BqnDPPU71cPPNsY7GGBMFliAiwOeDN95w\nDiGtXt3Y4bxggdPh3KdPrCOMgLfecqqHhx7qQsfEjDHHwxLEcfjkE6dSWLYMioudgTe33w7f/W4X\n6XAOV6B6GDLEmezJGJMQLEF0UGUlPPecUy385S+NHc5//KPT4ZyaGusIPfDmm/Duu86kT1Y9GJMw\nLEGEQRXWrXOSwsqVTofzSSfBL38JN9zQRTucwxVcPdx4Y6yjMcZEkSWINgQ6nBcvhm3bGjucb7oJ\nzj23C3c4d8Sbb8Lf/27VgzEJyBJEM/X1jR3OL73kdDiffjo88ghcfXU36XAOV+C8h6FDrXowJgFZ\ngnDt2tV4hnNJSWOH8403woQJsY4uRtasgffec+b/sOrBmIST8Ani88/h+uvhr391DhnNnAn/+7/w\n9a930w7ncAX6HoYOdYZlGWMSTsIniBNPdC4L8MtfOmc45+bGOqI48cYbTvWwYIFVD8YkqIRPEKmp\nzn7QBAlUD8OGWfVgTALzdByOiMwUkY9FZJeI/DTE8/eLyEb3tkNEjgQ9N0dEdrq3OV7GaZp54w3n\n6lT/+Z8JfpzNmMQm6tHl9EQkGdgBTAeKgHXAtaq6vZX2/wJMUdUbRSQLKAQKAAXWA1NV9XBr71dQ\nUKCFhYUR/hQJSBXOOgtKS2HnTksQxnRzIrJeVQtCPedlBXE6sEtVd6tqLbASmN1G+2uBp9zHM4A1\nqnrITQprgJkexmoCXn8d/vEPqx6MMZ4miFxgb9BykbuuBREZDowE1nZkWxG5VUQKRaTwwIEDEQk6\noQX6HoYPh7lzYx2NMSbG4uVc4GuAZ1U1zOtUOlR1oaoWqGrBwIEDPQotgbz2mlUPxpgGXiaIYmBo\n0PIQd10o19B4eKmj25pICK4e5tiYAGOMtwliHTBaREaKSCpOEljdvJGIjAX6AcGDTV8HLhaRfiLS\nD7jYXWe88tpr8M9/wp13WvVgjAE8PA9CVetF5HacHXsysERVt4nIvUChqgaSxTXASg0aTqWqh0Tk\nPpwkA3Cvqh7yKtaEF5hzacQIZ3paY4zB4xPlVPUV4JVm6+5qtnxPK9suAZZ4Fpxp9Oqrznzmjz5q\n1YMxpkG8dFKbWAn0PYwYYX0PxpgmEn6qjYT3yitO9bBoEaSkxDoaY0wcsQoikQWqh5Ejre/BGNOC\nVRCJ7M9/hsJC5+pIVj0YY5qxCiJRBVcP118f62iMMXHIKohE9ec/w/r1Vj0YY1plFUQiClQPo0ZZ\n9WCMaZVVEIno5Zed6mHJEqsejDGtsgoi0QRXD9/5TqyjMcbEMasgEs1LL8EHH8Bjj1n1YIxpU1gV\nhIg8LyJfExGrOLqyQPVw0klWPRhj2hXuDv9h4Dpgp4j8WkTGeBiT8cpLL8GGDc6MrT2seDTGtC2s\nBKGqb6rqt4FTgT3AmyLydxH5rojYcYquwKoHY0wHhX3ISET6A3OBm4ENwB9xEsYaTyIzkbV6tVM9\n/PznVj0YY8IS1p5CRF
4AxgCPA5epaqn71NMiUuhVcCZCAtVDXh58+9uxjsYY00WE+1PyAVV9O9QT\nqloQwXiMF158ETZuhGXLrHowxoQt3ENM40Wkb2DBvRTo9z2KyUSSKvziFzB6NFx3XayjMcZ0IeEm\niFtU9UhgQVUPA7d4E5KJqED1YH0PxpgOCjdBJIuIBBZEJBmwa1PGO7/f6XsYPRquvTbW0Rhjuphw\nf1K+htMh/Yi7/D13nYlnL74ImzbB8uVWPRhjOkxUtf1GzhnU3wO+4q5aAyxSVZ+HsXVIQUGBFhba\ngKoGfj9MmQLV1bBtmyUIY0xIIrK+tcFGYe01VNUPzHdvpitYtQo2b4bHH7fkYIzplHDPgxgN/Dcw\nHkgLrFfVUR7FZY6H3++MXDr5ZLjmmlhHY4zposL9afkYcDdwP3Ah8F1sqvD49cILTvXwxBNWPRhj\nOi3cnXy6qr6F02fxmareA3zNu7BMp1n1YIyJkHB/Xta4HdU7ReR2oBjI8C4s02kvvABbtjjVQ3Jy\nrKMxxnRh4VYQPwR6AXcAU4HvAHO8Csp0UuC8hzFjrHowxhy3disI96S4q1X1/wDlOP0PJh49/zxs\n3QpPPmnVgzHmuLVbQbjnOpzTmRcXkZki8rGI7BKRn7bS5lsisl1EtonIiqD1PhHZ6N5Wd+b9E0qg\n72HsWLj66lhHY4zpBsLtg9jg7qSfASoCK1X1+dY2cCuPh4DpQBGwTkRWq+r2oDajgZ8B01T1sIgM\nCnqJKlWdHP5HSXDPPedUDytWWPVgjImIcBNEGlAGXBS0ToFWEwRwOrBLVXcDiMhKYDawPajNLcBD\n7uR/qOr+MOMxwYKrh299K9bRGGO6iXDPpO5Mv0MusDdouQg4o1mbkwFE5F0gGbhHVQNzPKW5FyOq\nB36tqquav4GI3ArcCjBs2LBOhNhNPPecM52GVQ/GmAgK90zqx3AqhiZU9cYIvP9o4AJgCPBXEcl3\npxYfrqrFIjIKWCsiW1T1k2bvvxBYCM5cTMcZS9cUqB7GjbPqwRgTUeEeYno56HEacDlQ0s42xcDQ\noOUh7rpgRcA/VLUO+FREduAkjHWqWgygqrtF5B1gCvAJpqlnn3Wqh6eesurBGBNRYc3m2mIj56S5\nv6nq2W206QHswJkBthhYB1ynqtuC2swErlXVOSIyANgATAb8QKWq1rjr3wNmB3dwN5eQs7n6/ZCf\n71w1bssWSxDGmA477tlcQxgNDGqrgarWu2ddv47Tv7BEVbeJyL1Aoaqudp+7WES2Az7g31W1TETO\nBh4RET/OUNxft5UcEtYzz8D27bBypSUHY0zEhXs9iGM07YP4AviZqj7nVWAdlXAVhM8HEyc6jzdv\ntgRhjOmUSFwPIjOyIZnj9uyzVj0YYzwV1lxMInK5iPQJWu4rIt/wLizTJp/PGbk0fjx885uxjsYY\n002FO1nf3ap6NLDgDkO925uQTLueeQY+/BDuvhuS7LIcxhhvhLt3CdXOrkQTC4Hq4ZRT4KqrYh2N\nMaYbC3cnXygiv8eZWwngB8B6b0IybfrTn+Cjj5x7qx6MMR4Kdw/zL0At8DSwEqjGSRImmnw+uPde\np3q48spYR2OM6ebCHcVUAYScrttE0dNPW/VgjImacEcxrRGRvkHL/UTkde/CMi0EqocJE6x6MMZE\nRbh9EAPckUsAhLh2g/Ha00/Dxx87I5isejDGREG4exq/iDTMpy0iIwgxu6vxSKB6yM+HK66IdTTG\nmAQRbgXxn8DfROQvgADn4l6HwUTBypVO9fDss1Y9GGOiJtxO6tdEpAAnKWwAVgFVXgZmXIHqYeJE\nuPzyWEdjjEkg4V4w6GbghzjXdNgInIkzBfdFbW1nIuCpp2DHDueqcVY9GGOiKNw9zg+B04DPVPVC\nnIv3HGl7E3Pc6uvhvvuc6uEbNvWVMSa6wu2DqFbVahFBRHqq6kciMsbTyIzT92DVgzEm
RsJNEEXu\neRCrgDUichj4zLuwDPX1Tt/DpElWPRhjYiLcTupA7+g9IvI20Ad4zbOojNP3sHMnPP+8VQ/GmJjo\n8IysqvoXLwIxQQJ9D5MmwezZsY7GGJOgbMrueLRihVM9vPCCVQ/GmJixvU+8CVQPkydb9WCMiSmr\nIOLNihWwa5dTPYjEOhpjTAKzCiKeWPVgjIkjVkHEkyefdKqHVausejDGxJxVEPEiUD1MmQKzZsU6\nGmOMsQoibjzxBHzyCbz4olUPxpi4YBVEPKivh//6L6d6uOyyWEdjjDGAVRDx4fHHrXowxsQdqyBi\nra7OqR5OPdWqB2NMXPE0QYjITBH5WER2ichPW2nzLRHZLiLbRGRF0Po5IrLTvc3xMs6YeuIJ2L0b\n7rnHqgdjTFwRVW8uLS0iycAOYDpQBKwDrlXV7UFtRgN/Ai5S1cMiMkhV94tIFlAIFOBc+3o9MFVV\nD7f2fgUFBVpYWOjJZ/FMXR2MGQNZWbBunSUIY0zUich6VS0I9ZyXFcTpwC5V3a2qtcBKoPnZX7cA\nDwV2/Kq6310/A1ijqofc59YAMz2MNTYefxw+/dSqB2NMXPIyQeQCe4OWi9x1wU4GThaRd0XkfRGZ\n2YFtEZFbRaRQRAoPHDgQwdCjIND3MHUqfO1rsY7GGGNaiPUoph7AaOACnOtd/1VE8sPdWFUXAgvB\nOcTkRYCeWb7cqR4eeMCqB2NMXPKygigGhgYtD3HXBSsCVqtqnap+itNnMTrMbbuuQPVQUGDVgzEm\nbnmZINYBo0VkpIikAtcAq5u1WYVTPSAiA3AOOe0GXgcuFpF+ItIPuNhd1z0sXw579ljfgzEmrnl2\niElV60XkdpwdezKwRFW3ici9QKGqrqYxEWwHfMC/q2oZgIjch5NkAO5V1UNexRpVtbVO9XDaaXDp\npbGOxhhjWuXZMNdo6zLDXBctgltugT//2RKEMSbmYjXM1TRXWwu//CWcfjpcckmsozHGmDbFehRT\nYlm2zOl7ePhh63swxsQ9qyCiJbh6mNn9zvkzxnQ/VkFEy9Kl8NlnMH++VQ/GmC7BKohoCFQPZ5xh\n1YMxpsuwCiIali6Fzz+HRx6x6sEY02VYBeG14OphxoxYR2OMMWGzCsJrjz3mVA8LF1r1YIzpUqyC\n8FKgejjzTLj44lhHY4wxHWIVhJeWLIG9e+HRR616MMZ0OVZBeKWmBn71KzjrLKsejDFdklUQXnns\nMad6WLTIqgdjTJdkCcILNTVO38NZZ8H06bGOxpioqauro6ioiOrq6liHYppJS0tjyJAhpKSkhL2N\nJQgvLFkCRUXOvVUPJoEUFRWRmZnJiBEjEPu/HzdUlbKyMoqKihg5cmTY21kfRKQF+h7OPhu++tVY\nR2NMVFVXV9O/f39LDnFGRCtdRhUAABEGSURBVOjfv3+HKzurICJt8WKrHkxCs+QQnzrz72IVRCQF\nqodp06x6MMZ0eZYgImnxYigutmtNG2M65NNPP+WMM84gLy+Pq6++mtra2hZtysrKuPDCC8nIyOD2\n22+PSlyWICKlurqxevjKV2IdjTGmC/nJT37Cj3/8Y3bt2kW/fv1YvHhxizZpaWncd999/M///E/U\n4rI+iEgJVA/Llln1YAzwox/Bxo2Rfc3Jk+EPf2i7zX333ccTTzzBwIEDGTp0KFOnTqVPnz4sXLiQ\n2tpa8vLyePzxx+nVqxdz584lPT2dDRs2sH//fpYsWcLy5ct57733OOOMM1i6dCkAGRkZzJs3j1de\neYXs7Gx+9atf8R//8R98/vnn/OEPf2DWrFns2bOH66+/noqKCgAefPBBzj777HY/k6qydu1aVqxY\nAcCcOXO45557mDdvXpN2vXv35pxzzmHXrl0d/+I6ySqISKiuhv/+bzjnHLjoolhHY0zCWrduHc89\n9xybNm3i1VdfpbCwEIArrriCdevWsWnTJsaNG9fk
F/rhw4d57733uP/++5k1axY//vGP2bZtG1u2\nbGGjm+EqKiq46KKL2LZtG5mZmdx5552sWbOGF154gbvuuguAQYMGsWbNGj744AOefvpp7rjjDgCO\nHTvG5MmTQ962b99OWVkZffv2pUcP5/f6kCFDKC4ujubX1iqrICJh0SKrHoxppr1f+l549913mT17\nNmlpaaSlpXHZZZcBsHXrVu68806OHDlCeXk5M4Km3r/ssssQEfLz8xk8eDD5+fkAnHLKKezZs4fJ\nkyeTmprKTPdiX/n5+fTs2ZOUlBTy8/PZs2cP4JwkePvtt7Nx40aSk5PZsWMHAJmZmQ2JJpSDBw96\n8VVEhCWI4xWoHs4916oHY+LU3LlzWbVqFZMmTWLp0qW88847Dc/17NkTgKSkpIbHgeX6+noAUlJS\nGoaJBrcLbnP//fczePBgNm3ahN/vJy0tDXAqiHPPPTdkXCtWrGDcuHEcOXKE+vp6evToQVFREbm5\nuZH9AjrJDjEdr0cfhZISG7lkTByYNm0aL730EtXV1ZSXl/Pyyy8Dzk46Ozuburo6nnzySU/e++jR\no2RnZ5OUlMTjjz+Oz+cDGiuIULfx48cjIlx44YU8++yzACxbtozZs2d7EmNHWYI4HsHVw4UXxjoa\nYxLeaaedxqxZs5g4cSKXXHIJ+fn59OnTh/vuu48zzjiDadOmMXbsWE/e+/vf/z7Lli1j0qRJfPTR\nR/Tu3TvsbX/zm9/w+9//nry8PMrKyrjpppsAWL16dUMfB8CIESP413/9V5YuXcqQIUPYvn17xD9H\nMFFVT98gWgoKCjTQIRU1//u/cMcdsHatJQhjgA8//JBx48bFNIby8nIyMjKorKzkvPPOY+HChZx6\n6qkxjSlehPr3EZH1qloQqr31QXRWVZVTPZx3HlxwQayjMca4br31VrZv3051dTVz5syx5HAcLEF0\n1qOPQmkprFhhfQ/GxJHA+QTm+HnaByEiM0XkYxHZJSI/DfH8XBE5ICIb3dvNQc/5gtav9jLODquq\ngl//Gs4/36oHY0y35VkFISLJwEPAdKAIWCciq1W1ea/K06oaamKRKlWd7FV8x2XhwsbqwRhjuikv\nK4jTgV2qultVa4GVQHyM3ToegerhggusejDGdGteJohcYG/QcpG7rrkrRWSziDwrIkOD1qeJSKGI\nvC8i3wj1BiJyq9um8MCBAxEMvQ0LF8IXX8Ddd0fn/YwxJkZifR7ES8AIVZ0IrAGWBT033B16dR3w\nBxE5qfnGqrpQVQtUtWDgwIHeR2vVgzHGAw8++CB5eXmISJtTbyxbtozRo0czevRoli1b1mq7SPFy\nFFMxEFwRDHHXNVDVsqDFRcBvg54rdu93i8g7wBTgE6+CDcsjjzjVw8qVMQ3DGNO9TJs2ja9//etc\n0MYPz0OHDvGLX/yCwsJCRISpU6cya9Ys+vXr51lcXiaIdcBoERmJkxiuwakGGohItqqWuouzgA/d\n9f2ASlWtEZEBwDSCkkdMVFXBb37jnBB3/vkxDcWYrmDnzh9RXh7Z+b4zMiYzenTbswB2tem+AaZM\nmdJum9dff53p06eTlZUFwPTp03nttde49tprw3qPzvDsEJOq1gO3A6/j7Pj/pKrbROReEZnlNrtD\nRLaJyCbgDmCuu34cUOiufxv4dYjRT9G1YIFTPdxzT0zDMMa0ritO9x2u4uJihg5tPCgTjWnBPT1R\nTlVfAV5ptu6uoMc/A34WYru/A/lextYhlZVO9XDRRc6Z08aYdrX3S98LXXG673hmZ1KH45FHYN8+\n+NOfYh2JMaYT4nm67/Hjx4f1GXJzc5vEXVRU1GafRSTEehRT/LPqwZguoytO9x2uGTNm8MYbb3D4\n8GEOHz7MG2+80aQS8oIliPYsWOBUD9b3YEzc66rTfT/wwAMMGTKEoqIiJk6cyM03O7MOFRYWNjzO\nysri5z//Oaed
dhqnnXYad911V0OHtVdsuu+2VFbCyJGQnw9vvhnZ1zamG7LpvuObTfcdSfPnw/79\nVj0Y04XYdN+RYwmiNRUV8Nvfwle/CuecE+tojDFhsum+I8f6IFoTqB5sziVjTIKyBBGKVQ/GGGMJ\nIqT58+HAAet7MMYkNEsQzQWqh+nTYdq0WEdjjDExYwmiuYcfturBGBNVrU33rarccccd5OXlMXHi\nRD744IOQ269fv578/Hzy8vK44447iNTpC5YgggWqh4svhjBnYTTGmOM1bdo03nzzTYYPH95k/auv\nvsrOnTvZuXMnCxcuZN68eSG3nzdvHo8++mhD29deey0icdkw12APPQQHD9rIJWMi4Uc/gkhPUjd5\nMvwhcab7fvHFF7nhhhsQEc4880yOHDlCaWkp2dnZDW1KS0v58ssvOfPMMwG44YYbWLVqFZdccklY\n790WqyACysvhd7+z6sGYLqy7TfcdzhTfxcXFDBkypM02nWUVRMDDDzvVg/U9GBMZ7fzS94JN9x1Z\nliCgsXqYMQPOOivW0RhjIqyrTvedm5vL3r17G5aLiorIzc1t0aaoqKjNNp1lh5igse/BqgdjurTu\nNt33rFmzWL58OarK+++/T58+fZr0PwBkZ2dzwgkn8P7776OqLF++nNmzZ0fkM1mCCFQPM2eC28lj\njOmautt035deeimjRo0iLy+PW265hYcffrhhm8mTJzc8fvjhh7n55pvJy8vjpJNOikgHNdh031BS\nAj/8Ifzbv1mCMOY42XTf8c2m++6onBx45plYR2GMiRCb7jtyLEEYY7oVm+47cqwPwhgTUd3lsHV3\n05l/F0sQxpiISUtLo6yszJJEnFFVysrKGobehssOMRljIiYwEufAgQOxDsU0k5aW1uSM63BYgjDG\nRExKSgojR46MdRgmQuwQkzHGmJAsQRhjjAnJEoQxxpiQus2Z1CJyAPjsOF5iAHCw3VbRZ3F1jMXV\nMRZXx3THuIar6sBQT3SbBHG8RKSwtdPNY8ni6hiLq2Msro5JtLjsEJMxxpiQLEEYY4wJyRJEo4Wx\nDqAVFlfHWFwdY3F1TELFZX0QxhhjQrIKwhhjTEiWIIwxxoSUsAlCRL4pIttExC8irQ4PE5GZIvKx\niOwSkZ9GIa4sEVkjIjvd+36ttPOJyEb3ttrDeNr8/CLSU0Sedp//h4iM8CqWDsQ0V0QOBH0/N3sd\nk/u+S0Rkv4hsbeV5EZEH3Lg3i0hUrmQTRlwXiMjRoO/rrijFNVRE3haR7e7f4g9DtIn6dxZmXFH/\nzkQkTUT+KSKb3Lh+EaJNZP8eVTUhb8A4YAzwDlDQSptk4BNgFJAKbALGexzXb4Gfuo9/CvymlXbl\nUfiO2v38wPeBBe7ja4Cn4yCmucCDMfg/dR5wKrC1lecvBV4FBDgT+EecxHUB8HIMvq9s4FT3cSaw\nI8S/ZdS/szDjivp35n4HGe7jFOAfwJnN2kT07zFhKwhV/VBVP26n2enALlXdraq1wEpgtsehzQaW\nuY+XAd/w+P3aEs7nD473WeArIiIxjikmVPWvwKE2mswGlqvjfaCviGTHQVwxoaqlqvqB+/gY8CGQ\n26xZ1L+zMOOKOvc7KHcXU9xb81FGEf17TNgEEaZcYG/QchHe/0cZrKql7uMvgMGttEsTkUIReV9E\nvEoi4Xz+hjaqWg8cBfp7FE+4MQFc6R6SeFZEhnoYT0fE4v9TuM5yD128KiKnRPvN3UMhU3B+FQeL\n6XfWRlwQg+9MRJJFZCOwH1ijqq1+X5H4e+zW14MQkTeBE0M89Z+q+mK04wloK67gBVVVEWltHPJw\nVS0WkVHAWhHZoqqfRDrWLuol4ClVrRGR7+H8orooxjHFsw9w/j+Vi8ilwCpgdLTeXEQygOeAH6nq\nl9F63/a0E1dMvjNV9QGTRaQv8IKITFDVkH1LkdCtE4SqfvU4X6IYCP71OcRdd1
zaiktE9olItqqW\nuqX0/lZeo9i93y0i7+D8yol0ggjn8wfaFIlID6APUBbhODoUk6oGv/8inH6deODJ/6fjFbzzU9VX\nRORhERmgqp5PSiciKTg74SdV9fkQTWLynbUXVyy/M/c9j4jI28BMIDhBRPTv0Q4xtW0dMFpERopI\nKk6nj2cjhlyrgTnu4zlAi0pHRPqJSE/38QBgGrDdg1jC+fzB8V4FrFW3h8wj7cbU7Bj1LJxjyPFg\nNXCDOzLnTOBo0OHEmBGREwPHqUXkdJz9gpdJPvC+AiwGPlTV37fSLOrfWThxxeI7E5GBbuWAiKQD\n04GPmjWL7N9jNHvh4+kGXI5zPLMG2Ae87q7PAV4JancpziiGT3AOTXkdV3/gLWAn8CaQ5a4vABa5\nj88GtuCM4NkC3ORhPC0+P3AvMMt9nAY8A+wC/gmMisJ31F5M/w1sc7+ft4GxUfo/9RRQCtS5/7du\nAm4DbnOfF+AhN+4ttDJ6LgZx3R70fb0PnB2luM7B6WTdDGx0b5fG+jsLM66of2fARGCDG9dW4C53\nvWd/jzbVhjHGmJDsEJMxxpiQLEEYY4wJyRKEMcaYkCxBGGOMCckShDHGmJAsQRgTgoiUt9+q1W1v\nd2fTVPc8lcD6VmcmFZFsEXk5aPl0EfmrOLPWbhCRRSLSS0S+LiL3dv6TGRM+SxDGRN67wFeBz5qt\nvwRnOobRwK3A/KDn/hV4FEBEBuOMZf+Jqo5R1SnAazgzi/4ZuExEenn6CYzBEoQxbXJ/9f9ORLaK\nyBYRudpdn+ROr/CRONfteEVErgJQ1Q2quifEy7U1M+mVOEkA4AfAMlV9L7Chqj6rqvvUOXHpHeDr\nnnxgY4JYgjCmbVcAk4FJOFXB79yd+hXACGA8cD1wVhivFXJmUhEZCRxW1Rp3/QRgfRuvUwic24HP\nYEynWIIwpm3n4MwM61PVfcBfgNPc9c+oql9Vv8CZ0qOzsoEDHWi/H2dKGGM8ZQnCmOhpbWbSKpw5\ndAK2AVPbeJ00dxtjPGUJwpi2/T/gavdCLQNxLt/5T5yO6CvdvojBOJegbE9rM5PuwDlcFfAgMEdE\nzgisEJEr3PcBOJmmUzwb4wlLEMa07QWc2TM3AWuB/3APKT2H04ewHXgC5wIyRwFE5A4RKcKpEDaL\nyCL3tV4BduPMtPkozvWDUdUK4BMRyXOX9+FMY/4/7jDXD4EZwDH3dS7EGc1kjKdsNldjOklEMtS5\nolh/nKpimps8OvNalwNTVfXOdtoNBlao6lc68z7GdES3vqKcMR572b2ASypwX2eTA4CqvuAmmvYM\nA/6ts+9jTEdYBWGMMSYk64MwxhgTkiUIY4wxIVmCMMYYE5IlCGOMMSFZgjDGGBPS/wdHSr5htnfg\nlgAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<Figure size 432x288 with 1 Axes>"
            ]
          },
          "metadata": {
            "tags": []
          }
        }
      ]
    },
    {
      "cell_type": "markdown",
      "metadata": {
        "id": "CI-NnBJR5qDV",
        "colab_type": "text"
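      },
      "source": [
        "### 9.5 Comparative analysis\n",
        "\n",
        "#### 9.5.1 Raw data **vs.** raw data + PCA\n",
        "+ In terms of accuracy, the two are close;\n",
        "+ In terms of fluctuation, accuracy varies less after PCA and is more stable.\n",
        "\n",
        "#### 9.5.2 Raw data + TF-IDF **vs.** raw data + TF-IDF + PCA\n",
        "+ For smaller gamma values, the two perform about the same;\n",
        "+ At gamma=10.0, accuracy after PCA dimensionality reduction is clearly better than with TF-IDF alone.\n",
        "\n",
        "That said, I still do not fully understand the benefit PCA brings here, and I would appreciate the instructor's guidance. My understanding is that PCA (principal component analysis) keeps only some of the principal components and therefore ignores the influence of the minor ones, so it may not necessarily improve accuracy much during tuning, but the model may generalize better? I am not sure whether this understanding is correct."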
      ]
    }
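    ,
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As background for the gamma=10.0 observation above, it may help to write out the RBF kernel that `gamma` controls. This is the standard textbook definition rather than something derived in this notebook, and the distance bound assumes every feature has been min-max scaled into $[0, 1]$:\n",
        "\n",
        "$$K(x, x') = \\exp\\left(-\\gamma \\lVert x - x' \\rVert^2\\right), \\qquad \\lVert x - x' \\rVert^2 = \\sum_{j=1}^{d} (x_j - x'_j)^2 \\le d$$\n",
        "\n",
        "Under that assumption, the squared distance between two samples can be at most $d$, the number of features. Keeping fewer components lowers this ceiling, so with a large gamma the kernel value between distinct samples may collapse toward 0 less readily than in the full feature space. This is only one possible reading of the results, not a verified explanation."
      ]
    }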
  ]
}